[vox-tech] C - passing chars and pointer to chars

Ken Bloom kbloom at gmail.com
Sat Jun 3 20:27:37 PDT 2006


On Friday 02 June 2006 10:31, Peter Jay Salzman wrote:
> I've been learning Java's JNI API, and came across something about C
> that I never knew.
>
> There are 3 types of char:
>
>    char
>    signed char
>    unsigned char
>
> My understanding of the standard is that plain char behaves like either
> "signed char" or "unsigned char"; it's implementation-defined.  By
> assigning "c = 255" I found that on my own platform (GNU/x86) a char
> is implemented as a "signed char".  I think I remember reading that
> on Apple platforms, it's implemented as an "unsigned char".
>
> According to the gcc info page:
>
>      Ideally, a portable program should always use `signed char' or
>      `unsigned char' when it depends on the signedness of an object.
>      But many programs have been written to use plain `char' and
>      expect it to be signed, or expect it to be unsigned, depending on
>      the machines they were written for.  This option, and its
>      inverse, let you make such a program work with the opposite
>      default.
>
> *    The type `char' is always a distinct type from each of `signed
> *    char' or `unsigned char', even though its behavior is always
> *    just like one of those two.
>
> char is a *distinct type* from "signed char" or "unsigned char". 
> That surprised me.  So I did some experimentation and here's what I
> found.
>
> Apparently, there's no problem assigning the different chars to each
> other. The compiler does the automatic conversion:
>
>    char          a = 0;
>    signed char   b = 0;
>    unsigned char c = 0;
>
>    a = b; a = c;    // fine.
>    b = a; b = c;    // fine.
>    c = a; c = b;    // fine.
>
> You can even pass the different types of char to functions that take
> other types of char:
>
>    void takesAChar( char x, signed char y, unsigned char z );
>
>    takesAChar(a, b, c); takesAChar(a, c, b);  // fine.
>    takesAChar(b, a, c); takesAChar(b, c, a);  // fine.
>    takesAChar(c, b, a); takesAChar(c, a, b);  // fine.
>
> What the compiler complains about is passing *pointers* to different
> types of char:
>
>    void takesACharPtr( char *x, signed char *y, unsigned char *z );
>
>    takesACharPtr(&a, &b, &c); takesACharPtr(&a, &c, &b);  // warnings.
>    takesACharPtr(&b, &a, &c); takesACharPtr(&b, &c, &a);  // warnings.
>    takesACharPtr(&c, &a, &b); takesACharPtr(&c, &b, &a);  // warnings.
>
> The warning is:
>
>    pointer targets in passing argument foo of bar differ in signedness.
>
> I'm trying to understand this.  I'm fairly sure the standard says
> that all 3 types of char must have the same width.  For pointer
> operations like:
>
>    char s[] = "hello";
>    unsigned char *cptr = s;
>    ++cptr;
>    putc( *cptr, stdout );
>
> will correctly print "e" because "char" and "unsigned char" have the
> same width, and when we add one to cptr, it points to the correct
> location in memory.
>
> What I'm getting at is this.  Because all the chars have the same
> width, it doesn't matter WHAT kind of pointer you pass in to a
> function: char, signed char, or unsigned char.  Pointer arithmetic
> just works, and it works because they all have the same width.
>
> On the other hand, the data is what gets mangled if you don't use the
> correct type:
>
>    char c = 255;
>    printf("%d", c);
>
> prints, as expected, -1.  Not 255.
>
> So it seems to me that if the compiler complains about anything, it
> should complain about passing a different type of char, not a
> different type of char *.
>
> Why does gcc 4 complain about passing different "char *" and not
> "char"?
>
> And is this because of the standard or is it gcc specific?

Cue the **Fundamental axiom of the C++ type system**, stated as 
follows:
  A* is automatically convertible to B* if and only if A is a B.
  (Likewise for pass by reference).

(this is my own generalization though, and there may actually be 
exceptions)

When handling inheritance, if Derived is a Base (Derived inherits from 
Base), then Derived* can be automatically converted to Base*. But a 
Derived* is not a Base*, so a Derived** cannot be automatically 
converted to a Base**.
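
To make the inheritance case concrete, here is a minimal sketch. Base and 
Derived are the names used above; the helper upcast_keeps_identity() is 
made up purely for illustration. The static_asserts just ask the compiler 
which implicit conversions it accepts.

```cpp
#include <type_traits>

struct Base { virtual ~Base() = default; };
struct Derived : Base {};

// Derived is a Base, so Derived* converts implicitly to Base*.
static_assert(std::is_convertible<Derived*, Base*>::value,
              "Derived* -> Base* is implicit");

// Derived* is not a Base*, so one more level of indirection breaks it.
static_assert(!std::is_convertible<Derived**, Base**>::value,
              "Derived** -> Base** is not implicit");

// Hypothetical helper showing the allowed upcast, with the rejected
// conversion left in comments.
inline bool upcast_keeps_identity() {
    Derived d;
    Derived* pd = &d;
    Base* pb = pd;           // fine: implicit upcast
    // Base** ppb = &pd;     // error: invalid conversion
    // *ppb = new Base;      // if allowed, pd would point at a non-Derived
    return pb == pd;         // both still refer to the same object
}
```

The two commented-out lines show why the second conversion has to be 
rejected: writing through the Base** would leave a Derived* pointing at 
something that is not a Derived.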

When dealing with templates, you cannot pass vector<Derived> where a 
vector<Base> is expected, neither by reference nor by pointer, because 
vector<Derived> is not a vector<Base> (because if you were to stick a 
new Base into the vector, then it would violate the type of 
vector<Derived>).
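
The same check can be sketched for the vector case. The function 
as_base_pointers() is a hypothetical name, and copying element by element 
is only one possible workaround, not the only one.

```cpp
#include <type_traits>
#include <vector>

struct Base { virtual ~Base() = default; };
struct Derived : Base {};

// vector<Derived> is not a vector<Base>: neither a reference nor a
// pointer to one converts implicitly to the other.
static_assert(!std::is_convertible<std::vector<Derived>&,
                                   std::vector<Base>&>::value,
              "no implicit reference conversion");
static_assert(!std::is_convertible<std::vector<Derived>*,
                                   std::vector<Base>*>::value,
              "no implicit pointer conversion");

// Hypothetical workaround: build a new vector, converting each element.
// Each Derived* converts to Base* individually, so this compiles.
inline std::vector<Base*> as_base_pointers(std::vector<Derived*> const& in) {
    return std::vector<Base*>(in.begin(), in.end());
}
```

The result is a separate vector, so pushing a plain Base* into it cannot 
corrupt the original vector of Derived*.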

Supposing you wanted to create a new reference_counted_pointer<T>. A 
reference_counted_pointer<Derived> is not a 
reference_counted_pointer<Base>, and cannot be used as such, but you 
would want to implement all of the appropriate conversions when writing 
reference_counted_pointer<T> to mimic the semantics of an ordinary 
pointer.
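
A rough sketch of what "implementing the appropriate conversions" could 
look like follows. Everything except the class name is my own assumption 
(members, the get() and use_count() accessors, the use of enable_if); it 
is not thread-safe and not meant as a finished smart pointer.

```cpp
#include <type_traits>
#include <utility>

struct Base { virtual ~Base() = default; };
struct Derived : Base {};

template <typename T>
class reference_counted_pointer {
public:
    explicit reference_counted_pointer(T* p = nullptr)
        : ptr_(p), count_(p ? new long(1) : nullptr) {}

    reference_counted_pointer(reference_counted_pointer const& o)
        : ptr_(o.ptr_), count_(o.count_) { retain(); }

    // Converting constructor: only participates when U* converts to T*,
    // which mimics what an ordinary pointer would allow.
    template <typename U,
              typename = typename std::enable_if<
                  std::is_convertible<U*, T*>::value>::type>
    reference_counted_pointer(reference_counted_pointer<U> const& o)
        : ptr_(o.ptr_), count_(o.count_) { retain(); }

    // Copy-and-swap assignment; the old target is released when o dies.
    reference_counted_pointer& operator=(reference_counted_pointer o) {
        std::swap(ptr_, o.ptr_);
        std::swap(count_, o.count_);
        return *this;
    }

    ~reference_counted_pointer() { release(); }

    T* get() const { return ptr_; }
    long use_count() const { return count_ ? *count_ : 0; }

private:
    // Other instantiations need access to ptr_ and count_.
    template <typename U> friend class reference_counted_pointer;

    void retain() { if (count_) ++*count_; }
    void release() {
        if (count_ && --*count_ == 0) { delete ptr_; delete count_; }
    }

    T* ptr_;
    long* count_;
};

// Conversions follow the raw pointers: Derived -> Base yes, reverse no.
static_assert(std::is_convertible<reference_counted_pointer<Derived>,
                                  reference_counted_pointer<Base>>::value,
              "upcast allowed");
static_assert(!std::is_convertible<reference_counted_pointer<Base>,
                                   reference_counted_pointer<Derived>>::value,
              "downcast rejected");
```

Note that deleting through a reference_counted_pointer<Base> relies on 
Base having a virtual destructor, just as with a raw Base*.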

signed char is not an unsigned char, but they are convertible. However, 
signed char * is not implicitly convertible to unsigned char *, and to 
force such a conversion, you would use a reinterpret_cast<> (which 
reinterprets the actual bits according to a different type), or, as it 
seems from Bill's post, a static_cast<> (which is generally safer when 
it's allowed; between unrelated pointer types it is only allowed by 
going through void *).
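
Here is a small sketch of the casts in C++. The function names are 
hypothetical. As far as I can tell, a direct static_cast<> between 
signed char * and unsigned char * is rejected by the compiler, so the 
static_cast<> variant below goes through void *.

```cpp
#include <cstddef>

// Hypothetical helper: counts the bytes whose high bit is set.
inline int count_high_bit_bytes(unsigned char const* p, std::size_t n) {
    int count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (p[i] >= 0x80) ++count;
    }
    return count;
}

inline bool casts_agree() {
    signed char buf[] = { 'a', -1, 'b', -2 };

    // unsigned char* u = buf;   // error: no implicit conversion
    // unsigned char* u = static_cast<unsigned char*>(buf);   // error

    // Reinterpret the same bytes as unsigned char.
    unsigned char* u1 = reinterpret_cast<unsigned char*>(buf);

    // Two static_casts through void* reach the same address.
    unsigned char* u2 =
        static_cast<unsigned char*>(static_cast<void*>(buf));

    // On a two's complement machine, -1 and -2 read back as 0xFF and
    // 0xFE, so two bytes have the high bit set.
    return u1 == u2 && count_high_bit_bytes(u1, sizeof buf) == 2;
}
```

The pointer value is unchanged by either cast; only the interpretation 
of the bytes it points at differs.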

--Ken Bloom

-- 
I usually have a GPG digital signature included as an attachment.
See http://www.gnupg.org/ for info about these digital signatures.