The C++ string Class - A Short Primer

The C++ string class is a safe and convenient alternative to traditional C strings, which are simply null-terminated arrays of characters (usually handled through char *'s). Most programmers find string objects easier and more intuitive to use than char pointers, and they free the user from the messy details of memory allocation, reallocation, copying, and cleanup. This, in turn, tends to help prevent memory leaks and improper memory accesses. The drawback is a slight penalty in efficiency, but the advantages usually outweigh this.

When reading this primer, please note that many of the examples here are simple "dummy examples", some of which may seem to do unnecessary things or to use C++ string's in situations where traditional C-style strings would suffice or perhaps even be simpler. These are intended for illustration only; the usefulness of the various utilities presented here should become apparent in practice.

Making string's Available For Use

To use C++ string's, you need to have the following lines in your program, before you declare any string's:

#include <string>

using namespace std;

If you don't want to open up the entire std namespace (which contains everything in the C++ standard library), you can replace the second line with:

using std::string;

Declaring and Initializing string's

You declare an object of type string just like any other type:

string s;

You can initialize a string with another string, or even a C string constant. Here is a common way that you see string's initialized:

string s = "Hello World!";

Similarly, string variables can be assigned the values of other string's or of C string constants:

string s, t;

s = "Hello World!";

t = s;

C++ string's have copy semantics when assigning and initializing, which means that the value of the string on the right hand side is copied to the string on the left hand side, rather than having both string's point to the same area of memory as is the case with assignment of char *'s. This replaces the awkward and error-prone practice in C of explicitly allocating memory and performing a strcpy(). It may seem at first that casually using assignment in this manner leads to a lot of unnecessary copying, since string's which are only being read from do not actually need to be copied. Some older library implementations avoided this by sharing storage and copying only when a string was written to (copy-on-write), but the rules introduced in C++11 effectively rule that technique out. Modern implementations instead typically store short string's inside the object itself without allocating, and move rather than copy string's returned from functions. Therefore, it's OK to use string as a return type; for function parameters that are only read, passing a const string & avoids the copy altogether. The string class also takes care of deleting old memory if new memory has to be allocated, helping to avoid memory leaks.
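The copy behavior described above can be sketched as follows. This is only an illustration: the function names count_spaces and copy_then_modify are made up for this example, and the code spells out std:: rather than relying on a using directive.

```cpp
#include <string>

// Hypothetical helper: the parameter is a const reference, so the caller's
// string is read in place and no copy is made.
std::string::size_type count_spaces(const std::string& text) {
    std::string::size_type n = 0;
    for (std::string::size_type i = 0; i < text.length(); i++) {
        if (text[i] == ' ') {
            n++;
        }
    }
    return n;
}

// Hypothetical helper: shows that a copy owns its own characters.
// Changing the copy leaves the original untouched.
std::string copy_then_modify(const std::string& original) {
    std::string copy = original;  // copy semantics: an independent value
    if (!copy.empty()) {
        copy[0] = 'J';            // affects only the copy
    }
    return copy;
}
```

Calling copy_then_modify(s) with s equal to "Hello" returns "Jello", while s itself still holds "Hello" afterwards.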

Remember also that C string constants are NOT of type string (they are arrays of const char, which behave like const char * in most contexts), but can be implicitly converted to type string in initializations, assignments, or as arguments in function calls:

void foo(string s);

....

foo("Hello World");

You can also initialize string's with portions of other string's:

string s = "Well, Hello World, What's Up?";

string t(s, 6, 11);

Note that we are using explicit construction here, providing arguments to the string constructor to create t, rather than using implicit conversion with the = initialization syntax. The first argument is the source string. The second specifies the position (0 based) in the source string at which to begin copying, and the third specifies the number of characters to copy. Thus, in this example, the string t would receive the value "Hello World". You can also perform an assignment version of this copying using the assign() member function:

string t;

t.assign(s, 6, 11);
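As a small sketch of the two forms above, the following wraps each in a function so the results can be compared. The names copy_portion and assign_portion are hypothetical and exist only for this illustration.

```cpp
#include <string>

// Hypothetical helper: builds a new string from `count` characters of
// `source`, starting at position `pos`, using the constructor form.
std::string copy_portion(const std::string& source,
                         std::string::size_type pos,
                         std::string::size_type count) {
    std::string t(source, pos, count);
    return t;
}

// Hypothetical helper: does the same with assign(), replacing whatever
// value the target string held before.
std::string assign_portion(const std::string& source,
                           std::string::size_type pos,
                           std::string::size_type count) {
    std::string t = "old value";
    t.assign(source, pos, count);  // the old contents are discarded
    return t;
}
```

In both forms, a count that runs past the end of the source is simply cut off at the end, while a starting position beyond the end of the source throws std::out_of_range.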

I/O with string's

Performing I/O with string's is very easy; you can use the << and >> operators with the C++ stream classes to input and output strings just like you would any fundamental type:

string s;

cin >> s;

cout << s;

On input, the >> operator skips any leading whitespace and then reads up to the next whitespace character, so you cannot read in empty string's or string's containing spaces using the >> operator. If you wish to use a newline as a separator, so that you can read a line of text with spaces (possibly empty), use the getline() function:

getline(cin, s);

Note that C++ string input is very convenient compared to input of C-style strings using char *'s because you don't have to worry about the size of the input due to the limited amount of memory allocated for a char array. C++ string's will grow to accommodate the size of any value input.

When string's are output, each character is output, spaces and all.
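The difference between >> and getline() can be seen in a short sketch. To keep the example self-contained, it reads from a std::istringstream (a stream backed by a string) in place of cin; the helper names read_lines and read_words are invented for this illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits text into lines with getline(), which keeps
// spaces and also yields empty lines.
std::vector<std::string> read_lines(const std::string& text) {
    std::istringstream in(text);  // stands in for cin
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    return lines;
}

// Hypothetical helper: splits the same text into words with >>, which
// treats any run of whitespace as a separator.
std::vector<std::string> read_words(const std::string& text) {
    std::istringstream in(text);  // stands in for cin
    std::vector<std::string> words;
    std::string word;
    while (in >> word) {
        words.push_back(word);
    }
    return words;
}
```

Given the text "Hello World", an empty line, and then "bye", read_lines() yields three lines (the second one empty), whereas read_words() yields the three words "Hello", "World", and "bye".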

String Access

You can access individual elements of a string with the subscript operator ([ ]), just like with char arrays:

string s = "Hello World!";

for (string::size_type i = 0; i < s.length(); i++)

    cout << s[i];

Notice how you can obtain the length of a string using the length() member function (the size() member function does the same thing).

You can also use the subscript operator to assign to individual characters, like you can with a character array:

s[11] = '?';

Note that, unlike C-style strings, a C++ string does not rely on a terminating null to mark its end, since the length of the string is maintained internally; a string can even contain null characters as part of its value. (Since C++11, s[s.length()] is guaranteed to refer to a null character, but that character is not part of the string's value and must not be overwritten with anything else.) Keeping the length internally allows a string object's length to be determined as a constant-time operation, as opposed to using the C strlen() function.
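Reading and writing through the subscript operator can be combined in one loop, as in this small sketch. The function name replace_char is hypothetical; the index type string::size_type is the unsigned type that length() returns.

```cpp
#include <string>

// Hypothetical helper: returns a copy of s in which every occurrence of
// `from` has been replaced by `to`, using only length() and operator[].
std::string replace_char(std::string s, char from, char to) {
    for (std::string::size_type i = 0; i < s.length(); i++) {
        if (s[i] == from) {  // read a character
            s[i] = to;       // write a character
        }
    }
    return s;
}
```

The subscript operator performs no bounds checking. The at() member function takes the same index but throws std::out_of_range when the index is not less than length().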

Note: You may also wish to read the optional section at the end of this primer titled C++ string Iterators.

String Operations

C++ string's can be concatenated with the simple use of the + operator:

string s = "Hello";

string t = "World";

string u = s + " " + t;

Note that the single-space C-style string constant " " can be used as an operand to + because + is defined for combinations of a string and a C-style string. However, to do this, you need at least one of the operands to be of type string. So the code:

const char *s = "Hello", *t = "World";

string u = s + " " + t;

would be incorrect (the expression tries to add two pointers, which does not compile), but you could write:

string u = string(s) + " " + t;

You can also append characters to the end of a string using the append() member function:

u.append(1, '!');

The 1 specifies how many times you wish to append the character.

You can also insert characters and other string's or C-style strings into a string using the insert() member function:

string s = "Hello";

s.insert(5, "World!");

s.insert(5, 1, ' ');

In each call to insert(), the first parameter specifies the position (0 based) at which to insert the character or string. In the second case, inserting a single character requires a parameter specifying the number of times to insert, as in the case of append().

Note that with all of these operations, the user is freed from the details of having to compute string lengths, possibly allocate new memory, and invoke the strcpy() or strcat() functions. Again, the string object being copied to will grow as needed to accommodate its new value, and will clean up old memory when new memory is allocated, avoiding a possible memory leak.
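The operations in this section can be put together in a short sketch. The function names make_greeting and insert_demo are made up for this example, and make_greeting assumes that both of its pointer arguments are valid, non-null C-style strings.

```cpp
#include <string>

// Hypothetical helper: joins two C-style strings with a space and adds an
// exclamation mark. Assumes neither pointer is null.
std::string make_greeting(const char* first, const char* second) {
    // Converting the first operand to a string makes + mean concatenation.
    std::string u = std::string(first) + " " + second;
    u.append(1, '!');  // append one '!' character
    return u;
}

// Hypothetical helper: builds "Hello World!" using insert().
std::string insert_demo() {
    std::string s = "Hello";
    s.insert(5, "World!");  // s is now "HelloWorld!"
    s.insert(5, 1, ' ');    // s is now "Hello World!"
    return s;
}
```

Both make_greeting("Hello", "World") and insert_demo() produce the value "Hello World!", without any explicit length computation or memory management.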

Converting From C++ string's to C-style Strings

At times, you will want to obtain a read-only C-style string from a C++ string. You can do this by calling the c_str() member function. This tends to be very useful when interfacing with old libraries and code that use C-style strings instead of C++ string's. One situation in which this often arises is with filenames used to open files with C++ file stream objects:

string s = "myfile.txt";

ifstream infile(s);

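did not compile before C++11, since the ifstream constructor then accepted only an argument of type const char *. (Since C++11 it also accepts a string directly.) With older compilers, or with any function that expects a const char *, you can write: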

ifstream infile(s.c_str());

Note that the pointer value returned by c_str() points to an array owned by the string object (so you don't have to worry about deleting it as you would if it were a copy), but it is of type const char *. Thus, unlike the begin() function (see the optional section C++ string Iterators), the value returned here cannot be used to modify the string. The array is always null-terminated; since C++11 the string keeps the terminating null in place at all times, whereas older implementations were allowed to add it only when c_str() was called. Modifying the original string object from which this C-style string came can invalidate the pointer, so it is best to use it right away rather than store it.
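A minimal sketch of handing a string to C-style code follows. The function legacy_length is a hypothetical stand-in for an older library routine that accepts only const char *; here it simply wraps the standard strlen() function.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for an old C-style API that knows nothing about
// C++ string's and relies on the terminating null.
std::size_t legacy_length(const char* text) {
    return std::strlen(text);
}

// Hypothetical helper: passes a C++ string to the C-style function.
std::size_t c_length(const std::string& s) {
    // The pointer is used immediately and never stored, so it cannot be
    // invalidated by a later change to s.
    return legacy_length(s.c_str());
}
```

For the string "myfile.txt", c_length() returns 10, the same value as length(). The two would differ only for a string containing a null character in the middle, since the C-style function stops at the first null.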

C++ string Iterators (optional)

If you wish to obtain a "pointer" to the actual elements of the string, you can do so with the begin() and end() member functions:

string s = "Hello World!";

string::const_iterator iter = s.begin();

string::const_iterator end = s.end();

while (iter != end)

cout << *iter++;

The types string::iterator and string::const_iterator are internally defined types meant to mimic the properties of types char * and const char *, respectively. In many ways, they can be used in the same way as char *'s (they can be assigned, incremented, decremented, subscripted, and dereferenced). It should never be assumed that they actually are char *'s, though. In some implementations that will be the case, but in others, it will not, and your code becomes less portable if you make this assumption. Note again that, unlike the c_str() function (see above), begin() should not be treated as returning a pointer to a null-terminated C-style string, so you must use end(), which points to one-past-the-end of the string, or length(), to determine where the end of the string lies.

Though string iterators can be very useful, they are only necessary for advanced operations, and a full discussion of their use requires a discussion of the C++ standard header <algorithm>, which contains a number of utilities that fully exploit iterators. The functionality of the string class described so far is enough to allow you to perform almost any basic string operation without using iterators.
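As a brief taste of how iterators and <algorithm> fit together, here is a small sketch. The function names count_char and reversed are hypothetical; std::reverse is a standard function from <algorithm> that works on any pair of suitable iterators.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: counts occurrences of c by walking the string with
// a const_iterator, from begin() up to (but not including) end().
std::string::size_type count_char(const std::string& s, char c) {
    std::string::size_type n = 0;
    for (std::string::const_iterator it = s.begin(); it != s.end(); ++it) {
        if (*it == c) {
            ++n;
        }
    }
    return n;
}

// Hypothetical helper: returns the characters of s in reverse order by
// handing the string's iterators to std::reverse.
std::string reversed(std::string s) {
    std::reverse(s.begin(), s.end());
    return s;
}
```

With these definitions, count_char("Hello World!", 'l') gives 3 and reversed("Hello") gives "olleH".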