
Why do I get a segmentation fault when writing to a “char *s” initialized with a string literal, but not “char s[]”?

323

The following code segfaults on line 2:

char *str = "string";
str[0] = 'z';  // could also be written as *str = 'z'
printf("%s\n", str);

While this works perfectly well:

char str[] = "string";
str[0] = 'z';
printf("%s\n", str);

Tested with MSVC and GCC.

1

  • 4

    It's funny, but this actually compiles and runs perfectly with the Windows compiler (cl) from a Visual Studio developer command prompt. Got me confused for a few moments…

    Sep 13, 2016 at 8:30

270

See the C FAQ, Question 1.32

Q: What is the difference between these initializations?
char a[] = "string literal";
char *p = "string literal";
My program crashes if I try to assign a new value to p[i].

A: A string literal (the formal term for a double-quoted string in C source) can be used in two slightly different ways:

  1. As the initializer for an array of char, as in the declaration of char a[], it specifies the initial values of the characters in that array (and, if necessary, its size).
  2. Anywhere else, it turns into an unnamed, static array of characters, and this unnamed array may be stored in read-only memory, and which therefore cannot necessarily be modified. In an expression context, the array is converted at once to a pointer, as usual (see section 6), so the second declaration initializes p to point to the unnamed array’s first element.

Some compilers have a switch controlling whether string literals are writable or not (for compiling old code), and some may have options to cause string literals to be formally treated as arrays of const char (for better error catching).

6

  • 9

    Couple of other points: (1) the segfault happens as described, but its occurrence is a function of the run environment; if the same code was in an embedded system, the write may have no effect, or it may actually change the s to a z. (2) Because string literals are non-writable, the compiler can save space by putting two instances of “string” in the same place; or, if somewhere else in the code you have “another string”, then one chunk of memory could support both literals. Clearly, if code were then allowed to change those bytes, strange and difficult bugs could occur.

    – greggo

    Aug 26, 2011 at 23:24

  • 1

    @greggo: Good point. On systems with an MMU there is also a way to do this, by using mprotect to waive the read-only protection (see here).

    – user405725

    May 2, 2013 at 13:40

  • So char *p = "blah" actually creates a temporary array? Weird.

    Dec 3, 2014 at 13:35

  • 1

    And after 2 years of writing in C++…TIL

    Dec 28, 2014 at 0:41

  • 2

    @rahul tyagi, Not a temporary array. Quite the opposite, it’s the longest lived of arrays. It’s created by the compiler and found in the executable file itself. What you should have understood from the above is that it’s a shared array that must be treated as read-only (and may actually be read-only).

    – ikegami

    Nov 15, 2019 at 10:03


113

Normally, string literals are stored in read-only memory when the program runs. This prevents you from accidentally changing a string constant. In your first example, "string" is stored in read-only memory and str points to its first character. The segfault happens when you try to change that character to 'z'.

In the second example, the string "string" is copied by the compiler from its read-only home to the str[] array. Then changing the first character is permitted. You can check this by printing the address of each:

printf("%p\n", (void *)str);

Also, printing the size of str in the second example will show you that the compiler has allocated 7 bytes for it:

printf("%zu\n", sizeof(str));

4

  • 13

    Whenever using "%p" with printf, you should cast the pointer to void *, as in printf("%p", (void *)str). When printing a size_t with printf, you should use "%zu", which is available since C99.

    Oct 3, 2008 at 7:44

  • 4

    Also, the parentheses with sizeof are only needed when taking the size of a type (the argument then looks like a cast). Remember that sizeof is an operator, not a function.

    – unwind

    Nov 25, 2008 at 8:45

  • 1

    and use %zu to print size_t

    – phuclv

    Apr 11, 2017 at 15:36

  • warning: unknown conversion type character ‘z’ in format [-Wformat=] :/

    – john

    Feb 26, 2021 at 10:02

43

Most of these answers are correct, but just to add a little more clarity…

The “read-only memory” that people are referring to is typically the text segment, in ASM terms, or a read-only data section (such as .rodata) mapped alongside it. The text segment is the same place in memory where the instructions are loaded, and it is read-only for obvious reasons like security. When you create a char* initialized to a string, the string data is compiled into that read-only region and the program initializes the pointer to point into it. So if you try to change it, kaboom. Segfault.

When written as an array, the compiler places the initialized string data in writable memory instead: the data segment for a global or static array, which is the same place that your global variables and such live, or the stack for a local array. This memory is mutable, since there are no instructions in it. This time the character array (which is a real array, not a char*, although it decays to one in most expressions) holds its own copy of the characters, which you can safely alter at run-time.

4

  • But isn’t it true that there can be implementations that allow modifying the “read-only memory”?

    – Pacerier

    Sep 21, 2013 at 5:07

  • When written as an array, the compiler places the initialized string data in the data segment if they are static or global. Otherwise (e.g. for a normal automatic array) it places on the stack, in the stack frame of the function main. Correct?

    – S E

    Dec 4, 2019 at 2:15

    @SE Yeah, I would imagine that Bob Somers is referring to the stack, heap and static storage (including static and global variables) when writing “the data segment”. And a local array is put on the stack, so you’re correct there 🙂

    – Olov

    Dec 27, 2020 at 1:44

    Sorry, you are probably correct here. The data segment is the part of memory dedicated to initialized global or static variables, but the array could also be put on the stack if it is local, as you’ve written.

    – Olov

    Dec 27, 2020 at 1:53