Synopsis
You've come far, nearing completion of CS302. So what's next? If you are in the Computer Science program, you will be taking CS360 next. If you have been paying attention in lab, you might've caught on that CS360 isn't in C++. It's in C, and you will be expected to pick it up and do some challenging assignments with it right away. This guide will help smooth that transition by showing you how to do things in C that I've noticed beginners commonly get wrong.

There are a lot of example programs I give here. If you want a directory listing of all of the files, you can find it here: http://file.claranguyen.me/index.php?path=/guide/beyond_cs302
"Hello World" and compilation
Let's build the most basic of programs, just to lay some kind of foundation. If you're lucky and had CS130 in recent semesters, you are already familiar with some C functions like printf. If not, it's time you learned. It's basically the C way of printing data to standard output (much like C++'s cout). Behold the following C file hello_world.c:
File (hello_world.c)
#include <stdio.h>

int main() {
	printf("Hello World!\n");

	return 0;
}

Looks pretty similar to C++, huh? Alright, let's compile it. Unlike C++ code (which used g++), we are using a different command to compile this time. Behold gcc:
UNIX Command
UNIX> gcc -o hello_world hello_world.c

And if we run it, we get our expected output:
UNIX Command
UNIX> ./hello_world
Hello World!
Strings and string comparisons

Introducing C-strings

Did you know? The string type in C++ is actually a class! Shocker. Unfortunately, C doesn't have classes. We are forced to use C-style strings. Yup, this means char* and unsigned char*.

Let's try using one:
File (string_ex1.c)
#include <stdio.h>
#include <string.h>

int main() {
	const char *str = "Hello World!";
	printf("%s\n", str);

	//Get string length and print it (size_t is printed with %zu, not %d)
	size_t str_size = strlen(str);
	printf("The string's size is: %zu\n", str_size);

	return 0;
}

Here, I introduce the #include <string.h> line, as well as strlen. This function (as well as many others involving C-strings) is declared in "string.h", and it computes the length of a given C-string, not counting the null terminator.

I'd use this function sparingly. Run it once and store the value if you must. Even on modern hardware, calling it on millions or billions of strings (or repeatedly on the same string inside a loop) can slow your program down tremendously. Unlike C++, where the size of the string is stored in the string class (making it a constant time lookup), strlen is an O(N) function, as it has to scan until it hits the null terminator.

Of course, if we compile and run this we get:
UNIX Commands
UNIX> gcc -o string_ex1 string_ex1.c

UNIX> ./string_ex1
Hello World!
The string's size is: 12

String Comparisons

Another function you're going to see used a lot is strcmp. This function does what the name implies... it compares two strings. The twist, though, is that this function has an unorthodox return value. It returns 0 if the strings are the same, and a non-zero value if they aren't. The C standard only guarantees the sign of that value: negative if the first string sorts before the second, positive if it sorts after. Many implementations happen to return the difference between the first 2 characters that differ, but you shouldn't rely on that.

Let's try this out. Here is a program that takes 2 arguments from the terminal and tells whether they are the same or not:
File (string_ex2.c)
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
	//Argument check
	if (argc != 3) {
		fprintf(stderr, "usage: %s str1 str2\n", argv[0]);
		return 1;
	}

	printf("str1 = %s\nstr2 = %s\n", argv[1], argv[2]);

	//Let's see if they are the same...
	int comp = strcmp(argv[1], argv[2]);

	if (comp == 0)
		printf("Both strings are the same!\n");
	else
		printf("The strings are different (ret %d).\n", comp);

	return 0;
}

Here are 3 trial runs: one without arguments (to show the "usage" prompt), one with matching strings, and one with different strings:
UNIX Command (Compilation)
UNIX> gcc -o string_ex2 string_ex2.c
UNIX Command (Run 1)
UNIX> ./string_ex2
usage: ./string_ex2 str1 str2
UNIX Command (Run 2)
UNIX> ./string_ex2 "Hello World" "Hello World"
str1 = Hello World
str2 = Hello World
Both strings are the same!
UNIX Command (Run 3)
UNIX> ./string_ex2 "Hello World" "Hello CS360"
str1 = Hello World
str2 = Hello CS360
The strings are different (ret 20).

Wonder how that 20 was computed? strcmp stops at the first position where the strings differ. On this system, it then returned the difference between the two differing characters. In this case, it stops at 'W' in "Hello World" and 'C' in "Hello CS360". 'W' is 0x57 and 'C' is 0x43. With some basic hexadecimal math, we find that 0x57 - 0x43 = 0x14 (which is 20). Keep in mind that only the sign of the result is guaranteed, so another implementation may return a different positive number (such as 1) here. In most cases, you won't need this. You just need to check if strcmp returns 0 or not.

"Yeah that's nice and all, but why can't I use ==?"

Because char * is a pointer and you'd be comparing two pointers. Last I checked, argv[1] isn't at the same address as argv[2] so, no matter the contents of both, == will always be false. This isn't C++, so there's no overloaded operator== to do the comparison for you either. So unless you do something useless like if (argv[1] == argv[1]), there is no way you'll get == to return true here. Sorry, that's just how it is.
Structs and Classes
You've come to accept that C++ gives you a lot of power with classes. For instance, in C++, you're able to operator overload, make custom functions, use templates, etc. Well, that's all gone now... There are no classes in C. However, there are structs! Those help ease the pain a little. However, the syntax for them is slightly different. Hear me out.

In C++, you're used to seeing a struct defined like this:
File (struct_ex1.cpp)
struct thing {
	int a, b, c;
	string d;
};

int main() {
	thing foo, bar;

	foo.a = 2;
	/* etc */
}
That's fine and all. But if we do that in C, it won't compile: string doesn't exist, and thing on its own isn't a type name. Here it is in C style:
File (struct_ex1.c)
struct thing {
	int a, b, c;
	char *d;
};

int main() {
	struct thing foo, bar;

	foo.a = 2;
	/* etc */
}
Notice something? Look at the line below int main() {. Yeah, we had to put "struct" in front of the type when declaring variables in main. We can get around this with a typedef. Behold:
File (struct_ex2.c)
typedef struct {
	int a, b, c;
	char *d;
} thing;

int main() {
	thing foo, bar;
}

There are a few other differences as well:
  • There is no such thing as public, protected, and private. Everything is public and accessible.
  • You can't have functions in your structs, unlike in C++.
    • In C-style coding, this can be worked around with Function Pointers, but those are not advised.
    • To properly overcome this, programmers usually make functions that pass a pointer to the struct in. In C++, the compiler actually does this with class/struct functions by adding a pointer as the very first argument. This is where that this pointer comes from.
  • There is no inheritance or polymorphism in C. However, it can be mimicked rather easily if you know what you're doing.

"So how will I make functions for structs? Can I simulate classes?"

Sure. Assume the following C++ program:
File (struct_func_ex.cpp)
struct thing {
	int id;

	void set_id(int v) { id = v; }
};

int main() {
	thing foo;

	foo.set_id(2);
}
In C, structs can't have functions. So they must be rewritten. Here's a C-like way to do this:
File (struct_func_ex.c)
typedef struct {
	int id;
} thing;

void thing_set_id(thing* this, int v) {
	this->id = v;
}

int main() {
	thing foo;

	thing_set_id(&foo, 2);
}
The function has been moved out of the struct, and now takes an extra argument, being a pointer to the struct we want to modify. Sadly, C doesn't have pass-by-reference. We are forced to resort to pointers.

Really, that's it. It's just a rewriting to move the function outside the struct. Again, this is what a C++ Compiler does to your class/struct code to generate the this pointer. It isn't too bad.
Man/Manual Pages
For C++, you were probably accustomed to going to a website like cplusplus.com. With C, there is another option on UNIX-based systems: man pages. How do you use them? Just open up your terminal and type man followed by the name of a C function (for example, man malloc).

"I did man printf and got the UNIX command variant!"

Yes, in bash, there is a printf command that can be used along with ls, mkdir, etc. To find the man page for the C version, type the following:
UNIX Command
UNIX> man 3 printf

"I have the man command but none of the commands have pages!"

Man pages are handled differently depending on your computer and OS (yes, even different flavours of Linux). Depending on the OS, you may have to install a package to get the C programming manual. On UTK's Hydra/Tesla computers, man pages for commands you will use in CS360 will already be present.
Dynamic Allocation/Deallocation of Memory
In C++, you had keywords like new and delete (as well as their [] variants) to help you create dynamic arrays or just allocate a bit of data on-the-spot. These had the benefit of also calling class constructors/destructors, and they can be operator overloaded (if you really needed that...). Well, in C, you don't have these keywords. You don't need them either since there's no such thing as a class. Instead, C uses malloc and free, which are functions to allocate bytes of memory.

Here's the man page entry for malloc. It includes the entire "family" of functions.
Man page for "malloc"
#include <stdlib.h>

void *malloc(size_t size);
void free(void *ptr);
void *calloc(size_t nmemb, size_t size);
void *realloc(void *ptr, size_t size);
void *reallocarray(void *ptr, size_t nmemb, size_t size);

To aid in showing how these functions work, let's start with something familiar. Here's some C++ code that you're probably familiar with:
File (new_delete.cpp)
#include <iostream>
#include <string>

using namespace std;

int main() {
	int* a = new int[10];
	int i;

	for (i = 0; i < 10; i++)
		a[i] = i;

	//Print out the values in the array
	for (i = 0; i < 10; i++)
		cout << a[i] << " ";

	cout << endl;

	//Free memory
	delete[] a;
}
And it prints out the following:
Output
0 1 2 3 4 5 6 7 8 9 
Now let's look at it in C.
File (malloc_free.c)
#include <stdio.h>
#include <stdlib.h>

int main() {
	int* a = (int *) malloc(sizeof(int) * 10);
	int i;

	for (i = 0; i < 10; i++)
		a[i] = i;

	//Print out the values in the array
	for (i = 0; i < 10; i++)
		printf("%d ", a[i]);

	printf("\n");

	//Free memory
	free(a);
}
Structurally, it looks the same. Other than the output statements being changed to printf, though, what else has changed? The lines where new and delete were called were changed to malloc(...) and free(...) respectively. Confused? Let's break it down.

Breaking down malloc

malloc is very easy to mess up. The man page gives the following definition:
Man Page for "malloc"
void *malloc(size_t size);
So it takes size (a size_t, which is an unsigned integer type), which is the number of bytes we want to allocate in memory. It will return a void * pointer to that allocated memory. In C, a void * converts to any other object pointer type implicitly, so the (int *) cast in the code above is optional. C++ requires it, which is why you will often see it written anyway. Simple enough.

So if we want an array of 10 ints, we will call the following:
int* a = (int *) malloc(sizeof(int) * 10);
sizeof is typically used to get the size of a data type in bytes. Assume int is 4 bytes. Thus, sizeof(int) = 4. Since we want 10 of these integers (40 bytes) we have to perform the multiplication above.

Lastly, malloc can fail. Unlike new, which throws std::bad_alloc by default, malloc reports failure by returning NULL. So check the returned pointer before using it. If it's NULL, it failed.

Breaking down free

The man page gives the following definition:
Man Page for "free"
void free(void *ptr);
It just takes a pointer. Pass in the pointer that was returned by malloc earlier and your memory is freed. It really is that simple.

In the code above, it's called in the following fashion:
File (malloc_free.c)
int* a = (int *) malloc(sizeof(int) * 10);

/* do stuff */

free(a);


"Why should I care about free? Computers these days have gigabytes of RAM!"

And yet they can still be brought to their knees. In C++, the heavy-lifting memory management is handled for you by data structures like vector, map, etc. In C though, you will be handling it all yourself. All it takes is allocation of strings, and duplicating them a few million times in an infinite loop or something and you're out of memory. All the higher RAM means is that it'll just take a tad bit more effort to run out of memory. It doesn't mean you won't run out of memory.

Plus, assuming your CS360 TA isn't lazy, usually they are instructed to use a tool like valgrind to check if your program leaks memory or not. I know my TA was instructed to do that. And they deducted points if they saw a memory leak.
Other memory functions

Another choice: calloc

One of the catches of malloc is that the memory it returns is uninitialised: it may happen to contain zeros, but nothing guarantees it. To compensate for this, there is another choice, and it's one of my favourites. Behold calloc:
Man page for "calloc"
void *calloc(size_t nmemb, size_t size);
Functionally, it's the exact same as malloc, except it also sets the memory to 0 for you. It also takes two arguments instead of one... but it's obvious how to change a malloc call to this. nmemb is "number of members", or the number of elements in the array. size is the size of the data type to be allocated. Let's check it out:
File (calloc.c)
#include <stdio.h>
#include <stdlib.h>

int main() {
	int* a = (int *) calloc(10, sizeof(int));
	int i;

	//Print out the values in the array
	for (i = 0; i < 10; i++)
		printf("%d ", a[i]);

	printf("\n");

	//Free memory
	free(a);
}
And the output is...
Output
0 0 0 0 0 0 0 0 0 0 
All 0's, just as calloc guarantees.

"Resizing" of data: realloc

You know how C++ vectors are contiguous memory containers (resizable arrays)? It was covered in CS102. Ever wonder how they work? In C++, usually you would just call new to allocate an array of the larger (or smaller) size. Then you would just copy each element over to the new array. In C, you can do the same with malloc, but there is another option for you... realloc:
Man page for "realloc"
void *realloc(void *ptr, size_t size);
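This function has similar syntax to the rest of the "malloc" family. It'll take memory that has been allocated with malloc, calloc, or an earlier realloc and... well... "resize" it. It returns a void *, which is the address of the resized array. That address may differ from the original one, since the data may have been moved, so always use the returned pointer from then on.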

Let's check it out with an example similar to the ones above:
File (realloc.c)
#include <stdio.h>
#include <stdlib.h>

int main() {
	//Allocate memory. I think 10 would be fine.
	int* a = (int *) malloc(sizeof(int) * 10);
	int i;

	//Actually... I want 15...
	a = (int *) realloc(a, sizeof(int) * 15);

	//Put values in the array
	for (i = 0; i < 15; i++)
		a[i] = i;

	//Print out the values in the array
	for (i = 0; i < 15; i++)
		printf("%d ", a[i]);

	printf("\n");

	//Free memory
	free(a);
}
Of course, we can see the output would then be:
Output
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 
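Just like the others, realloc can fail. The way to check is the same as the others: if the returned pointer is NULL, you know the function failed to allocate memory. In that case the original block is left untouched and still has to be freed, so writing a = realloc(a, ...) as in the example above loses the only pointer to it on failure. Storing the result in a temporary pointer first avoids that.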
Conclusion
I hope this all helps! If you have had me as a TA in any prior classes, you will have seen me using some of these live in lab on the projector. Hopefully, since you managed to get past CS302, you'll be able to pick up most of these concepts swiftly. You'll need to, since you're expected to use C from the very start of the course. Best of luck!
Guide Information
Basic Information
Name: Beyond CS302 (Preparing for CS360)
Description: So what's next?
ID: beyond_cs302
File Information
File Size: 22.47 KB (23010 bytes)
Last Modified: 2020/01/03 23:18:03
Version: 1.0.0
Translators
en-gb Clara Nguyen (iDestyKK)