Jun 23, 2016

C Programming #72: Strings

Strings in C are not a primitive data type. To understand strings, you need to know a little about arrays and pointers, so review those two topics first if needed. This article explains what strings are and how they are used in C.


The basic dictionary definition of a string is:

a set of things tied or threaded together on a thin cord

In C, a string is a sequence of characters kept together, e.g. Hello, World. An array stores contiguous data of the same type, so we can use a character array to store a string.

// Character Array
char a[] = {'h','e','l','l','o',' ','w','o','r','l','d'};

One issue with the above representation is that nothing marks where the string ends. A character array carries no length information at run time, so any code that reads the string needs a delimiter at the end that says the string stops here. For strings, '\0' (the null character) is used as the delimiter. Note that it is not the digit character '0'; it is the character whose ASCII value is 0 (see the article on ASCII for more). With the delimiter, the string looks as follows:

// String
char a[] = {'h','e','l','l','o',' ','w','o','r','l','d','\0'};

This single concept is the source of many programmers' nightmares:

  1. Strings take one extra byte to store this delimiter.
  2. Without the delimiter, code that reads the string keeps going and treats all the data in memory after the end as part of the string.

Instead of initializing the array with the above syntax, C provides a more convenient one.

char a[] = "hello world";

Note that in the above definition, "hello world" is a string literal. There is no need to end the string with \0 here, as the compiler appends it automatically. We can print the string variable using the %s format specifier in printf, as follows. We can also modify it like any other array:

#include <stdio.h>
int main()
{
    char a[] = "hello world";
    printf("%s\n", a);
    a[6] = 'i';
    a[7] = 'n';
    a[8] = 'd';
    a[9] = 'i';
    a[10] = 'a';
    printf("%s\n", a);
    return 0;
}

Output:

hello world
hello india

We can also have a pointer to a string, as follows:

char *p = "hello world";

We can print the string through this pointer just like before, but look at what happens when we try to modify it:

#include <stdio.h>
int main()
{
    char *p = "hello world";
    printf("%s\n", p);
    p[6] = 'i';
    p[7] = 'n';
    p[8] = 'd';
    p[9] = 'i';
    p[10] = 'a';
    printf("%s\n", p);
    return 0;
}

If I compile and execute the above program, I get:

$ gcc a.c
$ ./a.out 
hello world
segmentation fault (core dumped)

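There is a run time error now, and a.out crashed. Later articles will cover what a run time error is and how to debug a crash. In simple words, an error that occurs while the binary is running, e.g. segmentation fault (core dumped), is called a run time error. Here p is just a pointer to the string literal "hello world". On typical systems this literal is stored in a read-only part of the binary (the text segment or a read-only data section). The statement p[6] = 'i' issued a write to that read-only memory, hence the segmentation fault. The C standard makes modifying a string literal undefined behavior, so a crash is the usual result but not a guaranteed one. By contrast, char a[] = "hello world" copies the literal into a writable array, which is why the earlier program worked.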
