Class Design in C++


Understanding Interfaces

When you're designing a class in C++, the first thing you should decide is the public interface for the class. The public interface determines how your class will be used by other programmers (or by you), and once designed and implemented it should generally stay pretty constant. You may decide to add to the interface, but once you've started using the class, it will be hard to remove functions from the public interface (unless they aren't used and weren't necessary in the first place).

But that doesn't mean you should include more functionality in your class than necessary just so that you can later decide what to remove from the interface. If you do this, you'll just make the class harder to use. People will ask questions like, "Why are there four ways of doing this? Which one is better? How can I choose between them?" It's usually easier to keep things simple and provide one way of doing each thing, unless there's a compelling reason why your class should offer multiple methods with the same basic functionality.

At the same time, just because adding methods to the public interface (probably) won't break anything, that doesn't mean you should start off with a tiny interface. If anybody decides to inherit from your class and you later add a function with the same name as one of theirs, you're in for a boatload of confusion. If you don't declare the function virtual, then the function called through a pointer to your class is chosen based on the static type of the pointer, which can be messy. If you do declare it virtual, then the subclass's version might provide a different kind of functionality than you intended. Finally, you simply can't add a pure virtual function to a class that's already in use, because nobody who has inherited from it will have implemented that function.

The public interface, then, should remain as constant as possible. In fact, a good approach to designing classes is to write the interface before the implementation, because the interface determines how your class interacts with the rest of the world (which matters more to the program as a whole than how the class is actually implemented). Moreover, if you write the interface first, you can get a feel for how the class will work with other classes before you dive into the implementation details.
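To see why the non-virtual case is messy, here is a minimal sketch (the class and function names are made up for illustration): a non-virtual function is chosen based on the static type of the pointer, while a virtual function is chosen based on the actual object.

#include <iostream>

class Base
{
public:
    void report() { std::cout << "Base::report" << std::endl; }             // non-virtual
    virtual void describe() { std::cout << "Base::describe" << std::endl; } // virtual
};

class Derived : public Base
{
public:
    void report() { std::cout << "Derived::report" << std::endl; }     // hides Base::report
    void describe() { std::cout << "Derived::describe" << std::endl; } // overrides Base::describe
};

int main()
{
    Derived d;
    Base* p = &d;
    p->report();   // prints "Base::report" -- chosen by the static type of p
    p->describe(); // prints "Derived::describe" -- chosen by the actual object
    return 0;
}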

Inheritance and Class Design

The second issue in your class design is what should be available to programmers who wish to create subclasses. This interface is primarily determined by virtual functions, but you can also include protected methods that are designed for use by the class or its subclasses (remember that protected methods are visible to subclasses, while private methods are not). A key consideration is whether it makes sense for a function to be virtual. A function should be virtual when its implementation is likely to differ from subclass to subclass; conversely, when a function's behavior should never change, it should be non-virtual. In short, decide whether to make a function virtual by asking whether the function should behave the same way for every subclass. For example, if you have a class that is designed to allow users to monitor network traffic and you want to allow subclasses that implement different ways of analyzing the traffic, you might use the following interface:


class TrafficWatch
{
public:
// Packet is some class that implements information about network packets
void addPacket (const Packet& network_packet);

int getAveragePacketSize ();

int getMaxPacketSize ();

virtual bool isOverloaded ();
};

In this class, some methods will not change from implementation to implementation: adding a packet should always be handled the same way, and the average packet size isn't going to change either. On the other hand, someone might have a very different idea of what it means to have an overloaded network. This will change from situation to situation, and we don't want to prevent someone from changing how it is computed: for some, anything over 10 Mbits/sec of traffic might mean an overloaded network; for others, it would take 100 Mbits/sec on specific network cables.

Finally, when publicly inheriting from any class or designing for inheritance, remember that inheritance should clearly model an is-a relationship. At heart, the is-a relationship means that the subclass should be able to appear anywhere the parent class could appear; from the standpoint of the user of the class, it should not matter whether an object is of the parent class or of a subclass. To design an is-a relationship, make sure it genuinely makes sense for every subclass to support each function in the base class, so that the base class doesn't include functions that subclasses might not actually need. One example of an extra function is a Bird class that implements a fly function. The problem is that not all birds can fly: penguins and emus, for instance. This suggests that a more prudent design choice might be to have two subclasses of Bird, one for birds that can fly and one for flightless birds. Of course, it might be overkill to have two subclasses, depending on how complex your class hierarchy will be, and if you know that nobody would ever use your class for a flightless bird, then it's not so bad. But you won't always know what someone will use your class for, and it's much easier to think carefully before you implement an entire class hierarchy than it is to go back and change it once people are using it.
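As a rough sketch of that flightless-bird point (the class names here are just for illustration), one way to split the hierarchy is:

class Bird
{
public:
    virtual ~Bird() {}
    virtual void eat() {}   // behavior every bird shares
};

class FlyingBird : public Bird
{
public:
    virtual void fly() {}   // only birds that can actually fly expose fly()
};

class Penguin : public Bird   // a flightless bird: there is no fly() to mis-implement
{
};

class Sparrow : public FlyingBird
{
};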

Hash Tables


Hash tables are an efficient implementation of a keyed array data structure, a structure sometimes known as an associative array or map. If you’re working in C++, you can take advantage of the STL map container for keyed arrays implemented using binary trees, but this article will give you some of the theory behind how a hash table works.

Keyed Arrays vs. Indexed Arrays

One of the biggest drawbacks to a language like C is that there are no keyed arrays. In a normal C array (also called an indexed array), the only way to access an element would be through its index number. To find element 50 of an array named “employees” you have to access it like this:
employees[50];
In a keyed array, however, you would be able to associate each element with a “key,” which can be anything from a name to a product model number. So, if you have a keyed array of employee records, you could access the record of employee “John Brown” like this:
employees["Brown, John"];
One basic form of a keyed array is called the hash table. In a hash table, a key is used to find an element instead of an index number. Since the hash table has to be coded using an indexed array, there has to be some way of transforming a key to an index number. That way is called the hashing function.

Hashing Functions

A hashing function can be just about anything. How the hashing function is actually coded depends on the situation, but generally the hashing function should return a value based on a key and the size of the array the hashing table is built on. Also, one important thing that is sometimes overlooked is that a hashing function has to return the same value every time it is given the same key.
Let’s say you wanted to organize a list of about 200 addresses by people’s last names. A hash table would be ideal for this sort of thing, so that you can access the records with the people’s last names as the keys.
First, we have to determine the size of the array we’re using. Let’s use a 260 element array so that there can be an average of about 10 element spaces per letter of the alphabet.
Now, we have to make a hashing function. First, let’s create a relationship between letters and numbers:
A --> 0
B --> 1
C --> 2
D --> 3
...
and so on until Z --> 25.
The easiest way to organize the hash table would be based on the first letter of the last name.
Since we have 260 elements, we can multiply the number for the first letter of the last name by 10. So, when a key like "Smith" is given, the key would be transformed to the index 180 (S is the 19th letter of the alphabet, so S --> 18, and 18 * 10 = 180).
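In C++, this first-letter hash might be sketched as follows (assuming last names start with an uppercase letter and the table has 260 slots):

#include <string>

// Map 'A'..'Z' to 0..25, then scale by 10 to spread names across 260 slots.
int hash_last_name(const std::string& last_name)
{
    int letter = last_name[0] - 'A';   // 'S' - 'A' is 18
    return letter * 10;                // "Smith" --> 180
}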
Since we use a simple function to generate an index number quickly, and we use the fact that the index number can be used to access an element directly, a hash table’s access time is quite small. A linked list of keys and elements wouldn’t be nearly as fast, since you would have to search through every single key-element pair.

Collisions and Collision Handling

Problems, of course, arise when we have last names with the same first letter. So "Webster" and "Whitney" would correspond to the same index number, 220. A situation like this, where two keys get sent to the same location in the array, is called a collision. If you're trying to insert an element, you might find that the space is already filled by a different one.
Of course, you might try to just make a huge array and thus make it almost impossible for collisions to happen, but then that defeats the purpose of using a hash table. One of the advantages of the hash table is that it is both fast and small.

Collision handling with open addressing

The simplest collision handling algorithm is known as the open address method or the closed hashing method. When you are adding an element, say “Whitney,” and you find that another element is already there (“Webster,” for instance) then you would just proceed to the next element space (the one after “Webster”). If that is filled, you go on to the next one, and so on, until you find an empty space to insert the new element (all those extra elements came in handy after all!)
...
220 "White" | <-- ### COLLISION ### : Gotta move on to the next.
221 "Webster" | <-- ### COLLISION ### : Next one.
222 | Ahhh, perfect. Insert Here.
223 |
...

Since we modified the insertion algorithm, we also have to change the function that finds the element. You have to have some way of verifying that you’ve found the element you want, and not some other element. The simplest way is to just compare keys. (Does this record have the last name “Whitney”? Does this one?) If the element you find is not one of them, just move on to the next element until you reach the one you want or you find an empty space (which means the element is not in the table).

Sounds simple, right? Well, it gets more complicated. What if you have so many collisions that you run off the end of the array?
If you’re trying to insert “Zorba” and all the elements are filled because of the collision handling, then what? Look at the example:
...
258 "Whitney" | <-- Nope, not Empty
259 "Zeno" | Nope, not Empty
---------------- <-- Ummm, what now?
The easiest thing to do is to just wrap around to the beginning again. If there are still no empty spaces, then we have to resize the array, since there isn't enough space in the hash table for all of the elements. If we resize the array, of course, we'll have to come up with a tweak to our hash function (or at least to how we handle it) so that it covers the right range of values again, but at least we'll have room. Note that resizing the array means that occasionally inserting a value will cause an O(n) copy operation to take place, but on average this should happen only once for every n items inserted, so insertion is still constant time, O(1), on average. (If you aren't sure what terms like "O(n)" and "constant time" mean, take a look at the Cprogramming.com article series on algorithmic efficiency.) As you can see, resizing isn't all that bad; still, if you know the amount of space you will need to start with, you can save your program some work.
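Here is a rough sketch of open-addressing insertion and lookup with wrap-around. The layout is an assumption for illustration: a fixed-size array of strings where an empty string marks a free slot, and hash_key is the first-letter hash from earlier.

#include <string>
#include <vector>

const int TABLE_SIZE = 260;
std::vector<std::string> table(TABLE_SIZE);   // "" means the slot is empty

int hash_key(const std::string& key)
{
    return (key[0] - 'A') * 10;   // the first-letter hash from earlier
}

// Returns true on success, false if the table is completely full.
bool insert(const std::string& key)
{
    int start = hash_key(key) % TABLE_SIZE;
    for (int i = 0; i < TABLE_SIZE; ++i)
    {
        int index = (start + i) % TABLE_SIZE;   // wrap around to the beginning
        if (table[index].empty() || table[index] == key)
        {
            table[index] = key;
            return true;
        }
    }
    return false;   // no empty space left: time to resize
}

// Returns the index of the key, or -1 if it is not in the table.
int find(const std::string& key)
{
    int start = hash_key(key) % TABLE_SIZE;
    for (int i = 0; i < TABLE_SIZE; ++i)
    {
        int index = (start + i) % TABLE_SIZE;
        if (table[index].empty())
            return -1;      // hit an empty slot, so the key is not in the table
        if (table[index] == key)
            return index;   // compare keys to make sure it's the right element
    }
    return -1;
}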

Handling collisions with separate chaining

A second collision handling strategy is to store a linked list at each element in the hash data structure. This way, when a collision occurs, you can just add the element into the linked list that is stored at the hash index. If you have only a single element with a particular hash value, then you have a single element list–no performance penalty. If you have a lot of elements hashing to the same value, you’ll see a slowdown of course, but no more than you otherwise would see with hash collisions.
One nice thing about separate chaining is that having a bunch of values that hash “near” each other is less important. With open addressing, if you have a cluster of values that hash to nearly the same value, you’ll run out of open space in that part of the hash. With separate chaining, each element that has a different hash value will not impact the other elements.
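A sketch of separate chaining, keeping a small linked list (here a std::list) in each slot; the names and the first-letter hash are just assumptions carried over from the earlier example:

#include <list>
#include <string>
#include <vector>

const int TABLE_SIZE = 260;
std::vector< std::list<std::string> > chains(TABLE_SIZE);

int hash_key(const std::string& key)
{
    return ((key[0] - 'A') * 10) % TABLE_SIZE;
}

void insert(const std::string& key)
{
    chains[hash_key(key)].push_back(key);   // a collision just extends this slot's list
}

bool contains(const std::string& key)
{
    const std::list<std::string>& bucket = chains[hash_key(key)];
    for (std::list<std::string>::const_iterator it = bucket.begin(); it != bucket.end(); ++it)
    {
        if (*it == key)
            return true;
    }
    return false;   // only this one bucket had to be searched
}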

Resizing dynamically based on a load factor

Generally speaking, you wouldn’t want your hash table to grow completely full because this will make lookups take much longer. If a value isn’t in the array, with open addressing, you have to keep looking until you hit an empty location or you get back to the starting point–in other words, with a completely full table, lookups could be O(n), which is horrible. A real hash table implementation will keep track of its load factor, the ratio of elements to array size. If you have a 10 element array, with 7 elements, the load factor is 0.7. In fact, 0.7 is generally about the right time to resize the underlying array.
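As a sketch, the resize decision might look like this (needs_resize is a hypothetical helper; a real hash table would track these counts internally):

// Returns true once the load factor (elements / slots) passes roughly 0.7,
// at which point the underlying array should be grown and every element re-hashed.
bool needs_resize(int num_elements, int table_size)
{
    return static_cast<double>(num_elements) / table_size > 0.7;
}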

Choosing a Good Hash Algorithm

The more collisions you have, the worse the performance of your hash table will be. With a good hash function and a large enough array, you can get an average performance that's quite good: essentially constant time, O(1). (The trick is to make the array grow over time as you start to fill it up.) But if you have a lot of elements that hash to the same value, then you will have to start doing a lookup through a list of elements that all share the same hash value, and your hash lookups can go from constant time to linear time in the number of elements. Imagine if your hash function hashed all values to 0, putting every element in the first slot of the array: then the table would just be a really complicated way of implementing a linear search.
Choosing a good hash algorithm can require some care and experimentation, and it will depend on your problem domain. If you're working with names, you probably don't want a hash algorithm that just looks at the first letter, because the letters of the alphabet are not used evenly: you'll find a lot more names that start with S than with Z. You also want your hash function to be fast; you don't want to lose all the time savings you're getting from the hash table because you're computing the hash function really slowly. It's a delicate balance.
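For strings, one common approach is to mix every character into the result rather than just the first letter. Here is a minimal sketch (the multiplier 31 is a conventional choice, not the only reasonable one):

#include <string>

// A polynomial-style string hash: every character affects the result,
// so "Smith" and "Sanchez" no longer collide just because both start with 'S'.
unsigned int hash_string(const std::string& key, unsigned int table_size)
{
    unsigned int hash = 0;
    for (std::string::size_type i = 0; i < key.size(); ++i)
    {
        hash = hash * 31 + static_cast<unsigned char>(key[i]);
    }
    return hash % table_size;
}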

Bitwise Operators in C and C++: A Tutorial


Generally, as a programmer you don't need to concern yourself with operations at the bit level. You're free to think in bytes, or ints and doubles, or even higher-level data types composed of a combination of these. But there are times when you'd like to be able to go to the level of an individual bit. Exclusive-or encryption is one example where you need bitwise operations.


Another example comes up when dealing with data compression: what if you wanted to compress a file? In principle, this means taking one representation and turning it into a representation that takes less space. One way of doing this is to use an encoding that takes less than 8 bits to store a byte. (For instance, if you knew that you would only be using the 26 letters of the Roman alphabet and didn’t care about capitalization, you’d only need 5 bits to do it.) In order to encode and decode files compressed in this manner, you need to actually extract data at the bit level.

Finally, you can use bit operations to speed up your program or perform neat tricks. (This isn’t always the best thing to do.)

Thinking about Bits

The byte is the lowest level at which we can access data; there's no "bit" type, and we can't ask for an individual bit. In fact, we can't even perform operations on a single bit: every bitwise operator will be applied to, at a minimum, an entire byte at a time. This means we'll be considering the whole representation of a number whenever we talk about applying a bitwise operator. (Note that this doesn't mean we can't ever change only one bit at a time; it just means we have to be smart about how we do it.)

Understanding what it means to apply a bitwise operator to an entire string of bits is probably easiest to see with the shifting operators. By convention, in C and C++ you can think about binary numbers as starting with the most significant bit to the left (i.e., 10000000 is 128, and 00000001 is 1). Regardless of underlying representation, you may treat this as true. As a consequence, the results of the left and right shift operators are not implementation dependent for unsigned numbers (for signed numbers, the right shift operator is implementation defined).

The leftshift operator is the equivalent of moving all the bits of a number a specified number of places to the left:

[variable]<<[number of places]

For instance, consider the number 8 written in binary 00001000. If we wanted to shift it to the left 2 places, we’d end up with 00100000; everything is moved to the left two places, and zeros are added as padding. This is the number 32 — in fact, left shifting is the equivalent of multiplying by a power of two.

int mult_by_pow_2(int number, int power)
{
return number<<power;
}

Note that in this example we're using ints, which are typically 4 bytes (32 bits) on modern systems and 2 bytes on some older ones, and that the operation gets applied to the entire sequence of bits, not just a single byte.

But what happens if we shift a number like 128 and we’re only storing it in a single byte: 10000000? Well, 128 * 2 = 256, and we can’t even store a number that big in a byte, so it shouldn’t be surprising that the result is 00000000.

It shouldn’t surprise you that there’s a corresponding right-shift operator: >> (especially considering that I mentioned it earlier). Note that a bitwise right-shift will be the equivalent of integer division by 2.

Why is it integer division? Consider the number 5, in binary, 00000101. 5/2 is 2.5, but if you are performing integer division, 5/2 is 2. When you perform a right shift by one: (unsigned int)5>>1, you end up with 00000010, as the rightmost 1 gets shifted off the end; this is the representation of the number 2. Note that this only holds true for unsigned integers; otherwise, we are not guaranteed that the padding bits will be all 0s.

Generally, using the left and right shift operators will result in significantly faster code than calculating and then multiplying by a power of two. The shift operators will also be useful later when we look at how to manipulate individual bits.

For now, let’s look at some of the other binary operators to see what they can do for us.

Bitwise AND

The bitwise AND operator is a single ampersand: &. A handy mnemonic is that & is the smaller version of the boolean AND, &&, and works on smaller pieces (bits instead of bytes, chars, integers, etc.). In essence, a binary AND simply takes the logical AND of the bits in each position of two numbers in binary form.

For instance, working with a byte (the char type):

01001000 & 
10111000 =
--------
00001000

The most significant bit of the first number is 0, so we know the most significant bit of the result must be 0; in the second most significant bit, the bit of the second number is zero, so the result there is 0 as well. The only position where both bits are 1, and hence the only position where the result is 1, is the fifth bit from the left. Consequently,

72 & 184 = 8

Bitwise OR

Bitwise OR works almost exactly the same way as bitwise AND. The only difference is that only one of the two bits needs to be a 1 for that position's bit in the result to be 1. (If both bits are a 1, the result will also have a 1 in that position.) The symbol is a pipe: |. Again, this is similar to the boolean logical OR operator, ||.

01001000 | 
10111000 =
--------
11111000

and consequently

72 | 184 = 248

Let’s take a look at an example of when you could use just these four operators to do something potentially useful. Let’s say that you wanted to keep track of certain boolean attributes about something — for instance, you might have eight cars (!) and want to keep track of which are in use. Let’s assign each of the cars a number from 0 to 7.

Since we have eight items, all we really need is a single byte, and we’ll use each of its eight bits to indicate whether or not a car is in use. To do this, we’ll declare a char called in_use, and set it to zero. (We’ll assume that none of the cars are initially “in use”.)

char in_use = 0;

Now, how can we check to make sure that a particular car is free before we try to use it? Well, we need to isolate the one bit that corresponds to that car. The strategy is simple: use bitwise operators to ensure every bit of the result is zero except, possibly, for the bit we want to extract.

Consider trying to extract bit 5 of a number, counting from zero at the rightmost bit: XX?XXXXX We want to know what the question mark is, and we aren't concerned about the Xs. We'd like to be sure that the X bits don't interfere with our result, so we probably need to use a bitwise AND of some kind to make sure they are all zeros. What about the question mark? If it's a 1, and we take the bitwise AND of XX?XXXXX and 00100000, then the result will be 00100000:

XX1XXXXX & 
00100000 =
--------
00100000

Whereas, if it’s a zero, then the result will be 00000000:

XX0XXXXX & 
00100000 =
--------
00000000

So we get a non-zero number if, and only if, the bit we’re interested in is a 1.

This procedure works for finding the bit in the nth position. The only thing left to do is to create a number with only the one bit in the correct position turned on. These are just powers of two, so one approach might be to do something like:

int is_in_use(int car_num)
{
// pow (from <cmath>) returns a double, so we cast the result back to an
// int before using it with the bitwise AND; conceptually this is still
// an operation between chars
return in_use & (int)pow(2, car_num);
}

While this function works, it can be confusing. It obscures the fact that what we want to do is shift a bit over a certain number of places, so that we have a number like 00100000 — a couple of zeros, a one, and some more zeros. (The one could also be first or last — 10000000 or 00000001.)

We can use a bitwise leftshift to accomplish this, and it’ll be much faster to boot. If we start with the number 1, we are guaranteed to have only a single bit, and we know it’s to the far-right. We’ll keep in mind that car 0 will have its data stored in the rightmost bit, and car 7 will be the leftmost.

int is_in_use(int car_num)
{
return in_use & 1<<car_num;
}

Note that shifting by zero places is a legal operation — we’ll just get back the same number we started with.

All we can do right now is check whether a car is in use; we can’t actually set the in-use bit for it. There are two cases to consider: indicating a car is in use, and removing a car from use. In one case, we need to turn a bit on, and in the other, turn a bit off.

Let’s tackle the problem of turning the bit on. What does this suggest we should do? If we have a bit set to zero, the only way we know right now to set it to 1 is to do a bitwise OR. Conveniently, if we perform a bitwise OR with only a single bit set to 1 (the rest are 0), then we won’t affect the rest of the number because anything ORed with zero remains the same (1 OR 0 is 1, and 0 OR 0 is 0).

Again we need to move a single bit into the correct position:

void set_in_use(int car_num)
{
in_use = in_use | 1<<car_num;
}

What does this do? Take the case of setting the leftmost bit (car 7) to 1: we have some number 0XXXXXXX | 10000000; the result is 1XXXXXXX. The shift is the same as before; the only difference is the operator and that we store the result.

Setting a car to be no longer in use is a bit more complicated. For that, we’ll need another operator.

The Bitwise Complement

The bitwise complement operator, the tilde, ~, flips every bit. A useful way to remember this is that the tilde is sometimes called a twiddle, and the bitwise complement twiddles every bit: if you have a 1, it’s a 0, and if you have a 0, it’s a 1.

This turns out to be a great way of finding the largest possible value for an unsigned number:

unsigned int max = ~0;

0, of course, is all 0s: 00000000 00000000. Once we twiddle 0, we get all 1s: 11111111 11111111. Since max is an unsigned int, we don’t have to worry about sign bits or twos complement. We know that all 1s is the largest possible number.

Note that ~ and ! cannot be used interchangeably. When you take the logical NOT of a non-zero number, you get 0 (FALSE). However, when you twiddle a non-zero number, the only time you’ll get 0 is when every bit is turned on. (This non-equivalence principle holds true for bitwise AND too, unless you know that you are using strictly the numbers 1 and 0. For bitwise OR, to be certain that it would be equivalent, you’d need to make sure that the underlying representation of 0 is all zeros to use it interchangeably. But don’t do that! It’ll make your code harder to understand.)

Now that we have a way of flipping bits, we can start thinking about how to turn off a single bit. We know that we want to leave other bits unaffected, but that if we have a 1 in the given position, we want it to be a 0. Take some time to think about how to do this before reading further.

We need to come up with a sequence of operations that leaves 1s and 0s in the non-target position unaffected; before, we used a bitwise OR, but we can also use a bitwise AND. 1 AND 1 is 1, and 0 AND 1 is 0. Now, to turn off a bit, we just need to AND it with 0: 1 AND 0 is 0. So if we want to indicate that car 2 is no longer in use, we want to take the bitwise AND of XXXXX1XX with 11111011.

How can we get that number? This is where the ability to take the complement of a number comes in handy: we already know how to turn a single bit on. If we turn one bit on and take the complement of the number, we get every bit on except that bit:

~(1<<position)

Now that we have this, we can just take the bitwise AND of this with the current field of cars, and the only bit we’ll change is the one of the car_num we’re interested in.

void set_unused(int car_num)
{
in_use = in_use & ~(1<<car_num);
}

You might be thinking to yourself, but this is kind of clunky. We actually need to know whether a car is in use or not (if the bit is on or off) before we can know which function to call. While this isn’t necessarily a bad thing, it means that we do need to know a little bit about what’s going on. There is an easier way, but first we need the last bitwise operator: exclusive-or.

Bitwise Exclusive-Or (XOR)

There is no boolean operator counterpart to bitwise exclusive-or, but there is a simple explanation. The exclusive-or operation takes two inputs and returns a 1 if either one or the other of the inputs is a 1, but not if both are. That is, if both inputs are 1 or both inputs are 0, it returns 0. Bitwise exclusive-or, written with a caret, ^, performs the exclusive-or operation on each pair of bits. Exclusive-or is commonly abbreviated XOR.

For instance, if you have two numbers represented in binary as 10101010 and 01110010 then taking the bitwise XOR results in 11011000. It’s easier to see this if the bits are lined up correctly:

01110010 ^
10101010
--------
11011000

You can think of XOR in the following way: you have some bit, either 1 or 0, that we’ll call A. When you take A XOR 0, then you always get A back: if A is 1, you get 1, and if A is 0, you get 0. On the other hand, when you take A XOR 1, you flip A. If A is 0, you get 1; if A is 1, you get 0.

So you can think of the XOR operation as a sort of selective twiddle: if you apply XOR to two numbers, one of which is all 1s, you get the equivalent of a twiddle.

Additionally, if you apply the XOR operation twice — say you have a bit, A, and another bit B, and you set C equal to A XOR B, and then take C XOR B: you get A XOR B XOR B, which essentially either flips every bit of A twice, or never flips the bit, so you just get back A. (You can also think of B XOR B as cancelling out.) As an exercise, can you think of a way to use this to exchange two integer variables without a temporary variable?

How does that help us? Well, remember the first principle: XORing a bit with 0 results in the same bit. So what we’d really like to be able to do is just call one function that flips the bit of the car we’re interested in — it doesn’t matter if it’s being turned on or turned off — and leaves the rest of the bits unchanged.

This sounds an awful lot like what we've done in the past; in fact, we only need to make one change to the function that turns a bit on. Instead of using a bitwise OR, we use a bitwise XOR. This leaves everything else unchanged, but flips the bit instead of always turning it on:

void flip_use_state(int car_num)
{
in_use = in_use ^ 1<<car_num;
}

When should you use bitwise operators?

Bitwise operators are good for saving space — but many times, space is hardly an issue. And one problem with working at the level of the individual bits is that if you decide you need more space or want to save some time — for instance, if we needed to store information about 9 cars instead of 8 — then you might have to redesign large portions of your program. On the other hand, sometimes you can use bitwise operators to cleverly remove dependencies, such as by using ~0 to find the largest possible integer. And bit shifting to multiply by two is a fairly common operation, so it doesn’t affect readability in the way that advanced use of bit manipulation can in some cases (for instance, using XOR to switch the values stored in two variables).
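For the curious, here is a sketch of the XOR swap mentioned above (a plain temporary variable is usually clearer and at least as fast, and this version must not be used when both references refer to the same variable):

void xor_swap(int& a, int& b)
{
    // After three XORs, each original value has cancelled itself out of the other,
    // leaving the two values exchanged.
    a = a ^ b;
    b = a ^ b;   // (a ^ b) ^ b == a
    a = a ^ b;   // (a ^ b) ^ a == b
}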

There are also times when you need to use bitwise operators: if you’re working with compression or some forms of encryption, or if you’re working on a system that expects bit fields to be used to store boolean attributes.

Summary

You should now be familiar with six bitwise operators:

Work on the bits of the left argument; the right argument is an integer giving the number of places to shift

bit_arg<<shift_arg

Shifts the bits of bit_arg shift_arg places to the left (equivalent to multiplication by 2^shift_arg)

bit_arg>>shift_arg

Shifts the bits of bit_arg shift_arg places to the right (equivalent to integer division by 2^shift_arg)

Work on the bits of both arguments

left_arg & right_arg

Takes the bitwise AND of left_arg and right_arg

left_arg ^ right_arg

Takes the bitwise XOR of left_arg and right_arg

left_arg | right_arg

Takes the bitwise OR of left_arg and right_arg

Works on the bits of its only argument

~arg

Reverses (flips) the bits of arg

Understanding Different Base Systems


This essay is targeted at new students of computer programming or computer science who want to understand how base two (binary), base eight (octal), and base sixteen (hexadecimal) work.


First of all, it’s important to realize that each of these base systems is just another way of writing down the same number. When you convert a number between different bases, it should still have the same value. In this essay, when I want to refer to the actual value of a number (regardless of its base), I’ll do it in base 10 because that’s what most people are used to.

It’s generally easiest to understand the concept of different bases by looking at base 10. When we have a number in base 10, each digit can be referred to as the ones digit, tens digit, the hundreds digit, the thousands digit, or so forth. For instance, in the number 432, 4 is the hundreds digit, 3 is the tens digit, and 2 is the ones digit.

Another way to think about this is to rewrite 432 as

  4 x 10^2
+ 3 x 10^1
+ 2 x 10^0

Each digit is multiplied by the next power of ten. Numbers in other bases, such as base 16, are merely numbers where the base is not ten! For instance, we could interpret 432 as though it were in base 16 by evaluating it as

  4 x 16^2
+ 3 x 16^1
+ 2 x 16^0

This would be the same as the number 1074 in base 10.

So to convert a number from a given base into base 10, all we need to do is treat each place as a power of the given base times the value of the digit in that place. Note that customarily for a given base, only digits from 0 to the base minus one are used. For instance, in decimal, we only use the digits 0 through 9. That’s because we don’t need any more digits to express every possible number. (But we do need at least that many; if we only had 8 digits, how would we ever express the value 9?)
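A small sketch of this rule in code, treating each character of a string as a digit from 0 to 9 and multiplying by the base as we go (digits above 9, such as hexadecimal A through F, are left out to keep the example short):

#include <string>

// Interpret a string of digit characters ('0' through '9') as a number in the given base.
// For example, to_decimal("432", 10) returns 432 and to_decimal("432", 16) returns 1074.
int to_decimal(const std::string& digits, int base)
{
    int value = 0;
    for (std::string::size_type i = 0; i < digits.size(); ++i)
    {
        value = value * base + (digits[i] - '0');   // shift everything one place left, then add the new digit
    }
    return value;
}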

Now, bases greater than 10 will require more than 10 possible digits. For instance, the number 11 in base ten can be expressed in base 16 with only a single digit because the ones place in base 16 can range from 0 to 15. Since we only have 10 digits, the letters A through F are used to stand for the “digits” 10 through 15. So, for instance, the hexadecimal number B stands for the decimal number 11.

Bases less than ten will require fewer digits–for instance, binary, which works using powers of two, only needs two digits: one and zero. The binary number 1001, for instance, is the same as writing

 
  1 * 2^3
+ 0 * 2^2
+ 0 * 2^1
+ 1 * 2^0

which comes out to the decimal value 9.

Numbers written in octal use a base of 8 instead of 2 or 16. See if you can figure out what the number 20 written in octal would be in base ten.

Because octal, hexadecimal, and decimal numbers can often share the same digits, there needs to be some way of distinguishing between them. Traditionally, octal numbers are written with a leading 0; for instance, 020 is the same thing as the number 20 in base 8. Hexadecimal numbers are written with the prefix of “0x”. So 0x20 would be the number 20 in base 16; we’d interpret it the same as the decimal number 32.
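In C and C++ these prefixes work directly in source code, so a quick way to double-check a conversion is to print the literals:

#include <iostream>

int main()
{
    std::cout << 020 << std::endl;    // octal 20   --> prints 16
    std::cout << 0x20 << std::endl;   // hex 20     --> prints 32
    std::cout << 20 << std::endl;     // decimal 20 --> prints 20
    return 0;
}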

Converting from binary to octal or hexadecimal

It turns out that when you wish to convert from binary to octal or hexadecimal, there is a very easy trick that you can use. I'll give you the one for octal, and let you puzzle out the hexadecimal version (which may come quite naturally to some of you).

To convert from binary to octal, all you need to do is group the binary digits into sets of three (starting from the right) and convert each group into the corresponding octal digit. For instance, given the binary number 010011110, you would group it as 010, 011, and 110. 010 is 2, 011 is 3, and 110 is 6. So the octal number is 0236.

So why exactly does this work? Well, let’s take a look at what 010011110 looks like:

 
  0 * 2^8
+ 1 * 2^7
+ 0 * 2^6
+ 0 * 2^5
+ 1 * 2^4
+ 1 * 2^3
+ 1 * 2^2
+ 1 * 2^1
+ 0 * 2^0

That’s actually the same as

 
  0 * 2^2 * 2^6
+ 1 * 2^1 * 2^6
+ 0 * 2^0 * 2^6

+ 0 * 2^2 * 2^3
+ 1 * 2^1 * 2^3
+ 1 * 2^0 * 2^3

+ 1 * 2^2 * 2^0
+ 1 * 2^1 * 2^0
+ 0 * 2^0 * 2^0

Whoa! First, notice that the far right column is actually turning into powers of 8: 2^3 is 8, and 2^6 is 64. So for each group of three digits, the base increases by a factor of 8. Moreover, look at the left two columns of each group: they can sum to at most 7 (since 2^0 + 2^1 + 2^2 = 1 + 2 + 4 = 7, and each binary digit just decides whether its power of two is included in the sum or not). That's exactly the same as having eight digits, 0 through 7, and once we sum up each group, we multiply it by a power of eight. That's just the same as making each group of three binary digits an octal digit!
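Here is a sketch of the grouping trick in code: pad the binary string on the left to a multiple of three digits, then turn each group of three bits into one octal digit (binary_to_octal is a made-up name for illustration):

#include <string>

// Convert a string of '0'/'1' characters to its octal representation;
// for example, binary_to_octal("010011110") returns "236".
std::string binary_to_octal(std::string bits)
{
    while (bits.size() % 3 != 0)
        bits = "0" + bits;   // pad on the left so the groups of three line up

    std::string octal;
    for (std::string::size_type i = 0; i < bits.size(); i += 3)
    {
        int digit = (bits[i] - '0') * 4 + (bits[i + 1] - '0') * 2 + (bits[i + 2] - '0');
        octal += static_cast<char>('0' + digit);   // each group of three bits becomes one octal digit
    }
    return octal;
}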

Integer to English string program in C++


#include <iostream>
#include <string>

using namespace std;

string num_to_text[] = { "", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen", "seventeen", "eighteen", "nineteen" };

string tens_to_text[] = { "", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety" };

string power_to_text[] = { "", "thousand", "million", "billion" };

string padIfNeeded (string ans)
{
if ( ans == "" )
{
return "";
}
return " " + ans;
}

string translateHundred (int hundred_chunk)
{
// handle special cases in the teens
if ( hundred_chunk < 20 ) {
return num_to_text[ hundred_chunk ];
}
int tens = hundred_chunk / 10;
int ones = hundred_chunk % 10;
return tens_to_text[ tens ] + padIfNeeded( num_to_text[ ones ] );
}


string translateThousand (int thousand_chunk)
{
if ( thousand_chunk < 100 )
{
return translateHundred( thousand_chunk );
}
else
{
int hundreds = thousand_chunk / 100;
int hundred_chunk = thousand_chunk % 100;
return num_to_text[ hundreds ] + " hundred" + padIfNeeded( translateHundred( hundred_chunk ) );
}
}


int main()
{
int n;
cin >> n;
string number;
bool is_negative = false;
if ( n < 0 )
{
is_negative = true;
n *= -1;
}

int chunk_count = 0;
while ( n > 0 )
{
if ( n % 1000 != 0 ) {
number = translateThousand( n % 1000 ) + padIfNeeded( power_to_text[ chunk_count ] + padIfNeeded( number ) );
}
n /= 1000;
chunk_count++;
}
if ( number == "" )
{
number = "zero";
}
if ( is_negative )
{
number = "negative " + number;
}
cout << number << endl;
}

What's the difference between static and extern?


The extern and static (1) keywords seem to cause more confusion than they should. They are both storage-class specifiers, more specifically linkage specifiers. Linkage is a concept that tells the compiler what to do when two files declare an object or function with the same name. Of course, this only applies to “file level” identifiers, i.e. functions and global variables. It does not apply to function parameters or local function variables, since those are only visible from within that function.
The extern keyword specifies external linkage. It tells the compiler that the object or function declared here is actually defined in another file. It helps here to explain the difference between definition and declaration. A definition is where the variable is created and space is set aside for it; for functions, the definition is the function body. A declaration simply states the linkage and type of the variable, so the compiler can ensure you're using it correctly; for functions, the declaration states the linkage, return type, parameter count and parameter types. See the Cprogramming.com entry on the difference between declaring and defining in C and C++ for more.
The static keyword is somewhat the opposite of extern. It tells the compiler that the object or function declared is internally linked, and only visible from within that translation unit (a translation unit is a technical term for a .c file after all the preprocessing is finished and the #includes added in).
extern should be used in a header file to make functions available to other .c files that wish to use that interface. This makes them akin to public members in a C++ class. static should be used in the .c file with the variable and function definitions, to hide functions and variables from other .c files. This is akin to private members in a C++ class.
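As a sketch of that convention (the file names and functions here are made up for illustration):

/*
 * counter.h -- the public interface, included by any .c file that needs it
 */
extern int get_count(void);    /* defined in counter.c */
extern void increment(void);

/*
 * counter.c -- the implementation
 */
static int count = 0;          /* internal linkage: invisible to other .c files */

static void log_change(void)   /* helper hidden inside this translation unit */
{
    /* ... */
}

void increment(void)           /* external linkage: callable from any file that includes counter.h */
{
    count++;
    log_change();
}

int get_count(void)
{
    return count;
}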
As you can tell, it makes no sense to have something declared as static and again as extern. Doing so results in undefined behavior, so don’t do it!
For an example, see the Multiple Source Files FAQ entry.
Footnote:
(1) static seems to be a poorly chosen keyword for specifying linkage, and has confused many, if not all, C programmers at some point (its choice for describing storage duration, however, is excellent). There is nothing particularly static about something with internal (static) linkage. Such objects don't stay in one place any more than their non-static counterparts. A more appropriate choice might have been intern(al) or private, but it's too late now.

Multiple source files for one program (C++ example)


Let's say you have just written a program containing a function and a class that you would like to reuse in another program. What options do you have? You could rewrite the code each time you want to use it, or you could create your own header file (and an accompanying cpp file) to hold the code and then list that header among the included header files of your next program. This entry describes that process. Usually you put the descriptive part of the function/class in the header file, and the interesting/working part in the cpp file (the exception being templated classes/functions, which need all of their information in the header file). Although not mandatory, it is recommended that you use macro guards in your header files. Here's a simplistic example:

/*
* Program1.cpp
*/


//all in one program. Want to "save" class and function for reuse.
#include <iostream>

using namespace std;

class MyClass
{
public:
//using default default constructor
//using default copy constructor
//using default assignment operator
//using default destructor
int getData(void);
void setData(int);

private:
int data;
};

int MyClass::getData(void)
{
return(data);
}

void MyClass::setData(int input)
{
data = input;
}

void myFunc(int &);

int main(void)
{
int num = 3;

myFunc(num);
cout << num << endl;

MyClass sample;
sample.setData(num);
cout << sample.getData();

return(0);
}

void myFunc(int &Num)
{
Num += 10;
}

To break Program1 up into reusable segments, do the following. For the CLASS:

/*
* The header file: Data.h
*/


#ifndef DATA_H
#define DATA_H
//put just the descriptive part of the class in here
class MyClass
{
public:
//using default default constructor
//using default copy constructor
//using default assignment operator
//using default destructor
int getData();
void setData(int);

private:
int data;
};
#endif

For the associated cpp file for the CLASS

/*
* Data.cpp
*/


#include "Data.h"

//put interesting/working part of class in here
int MyClass::getData()
{
return data;
}

void MyClass::setData(int input)
{
data = input;
}

For the function

/*
* The header file: MyFunctions.h
*/


#ifndef MYFUNCTIONS_H
#define MYFUNCTIONS_H
//list descriptive part of function--usually just the prototype(s)
void myFunc(int &);
//any other functions that you might want to include could go here too.
#endif

For the associated cpp file for the function

/*
 * The associated cpp file for the functions: MyFunctions.cpp
*/


#include "MyFunctions.h"
//here you put the working part of the function(s)
//usually the definitions
void myFunc(int &Num)
{
Num += 10;
}

Now put the program back together using the modules

/*
* Program2.cpp
*/


//driver program for MyClass and MyFunctions
#include <iostream>

using namespace std;

//include your header files this time, not the actual code as before
//Note: do NOT #include the cpp files, and don't list them here;
//they are compiled separately and the linker combines the resulting
//object files (see the compiler command line below)
#include "Data.h"
#include "MyFunctions.h"

int main(void)
{
int num = 3;
myFunc(num);
cout << num << endl;

MyClass sample;
sample.setData(num);
cout << sample.getData();
return(0);
}

And here’s the output from the compiler:

C:\>bcc32 -emyprog.exe myfunctions.cpp program2.cpp data.cpp
Borland C++ 5.5 for Win32 Copyright (c) 1993, 2000 Borland
myfunctions.cpp:
program2.cpp:
data.cpp:
Turbo Incremental Link 5.00 Copyright (c) 1997, 2000 Borland

C:\>myprog
13
13

For most header files you create on your computer with your compiler, this should work just fine. But let's say you copy your header file and associated cpp file to a floppy to give to a friend, or you download the MyClass files above to your computer from a shareware site. If the header file and cpp file you are trying to use are not in the same directory/folder as the program that includes them, the compiler needs to be told where to find them: either add that directory to your compiler's include search path, or install the header in one of the compiler's standard include directories and include it with angle brackets, as you do for iostream (which lives in a different directory from the program you are working on).