[vox-tech] C - passing chars and pointer to chars
Ken Bloom
kbloom at gmail.com
Sat Jun 3 20:27:37 PDT 2006
On Friday 02 June 2006 10:31, Peter Jay Salzman wrote:
> I've been learning Java's JNI API, and came across something about C
> that I never knew.
>
> There are 3 types of char:
>
> char
> signed char
> unsigned char
>
> My understanding of the standard says that char can either be of type
> "signed char" or "unsigned char"; it's implementation specific. By
> assigning "c = 255" I found that on my own platform (GNU/x86) a char
> is implemented as a "signed char". I think I remember reading that
> on Apple platforms, it's implemented as an "unsigned char".
>
> According to the gcc info page:
>
> Ideally, a portable program should always use `signed char' or
> `unsigned char' when it depends on the signedness of an object.
> But many programs have been written to use plain `char' and
> expect it to be signed, or expect it to be unsigned, depending on the
> machines they were written for. This option, and its inverse, let
> you make such a program work with the opposite default.
>
> The type `char' is always a distinct type from each of `signed
> char' or `unsigned char', even though its behavior is always
> just like one of those two.
>
> char is a *distinct type* from "signed char" or "unsigned char".
> That surprised me. So I did some experimentation and here's what I
> found.
>
> Apparently, there's no problem assigning the different chars to each
> other. The compiler does the automatic conversion:
>
> char a = 0;
> signed char b = 0;
> unsigned char c = 0;
>
> a = b; a = c; // fine.
> b = a; b = c; // fine.
> c = a; c = b; // fine.
>
> You can even pass the different types of char to functions that take
> other types of char:
>
> void takesAChar( char x, signed char y, unsigned char z );
>
> takesAChar(a, b, c); takesAChar(a, c, b); // fine.
> takesAChar(b, a, c); takesAChar(b, c, a); // fine.
> takesAChar(c, b, a); takesAChar(c, a, b); // fine.
>
> What the compiler complains about is passing *pointers* to different
> types of char:
>
> void takesACharPtr( char *x, signed char *y, unsigned char *z );
>
> takesACharPtr(&a, &b, &c); takesACharPtr(&a, &c, &b); // warnings.
> takesACharPtr(&b, &a, &c); takesACharPtr(&b, &c, &a); // warnings.
> takesACharPtr(&c, &a, &b); takesACharPtr(&c, &b, &a); // warnings.
>
> The warning is:
>
> pointer targets in passing argument foo of bar differ in
> signedness.
>
> I'm trying to understand this. I'm fairly sure the standard says
> that all 3 types of char must have the same width. For pointer
> operations like:
>
> char s[] = "hello";
> unsigned char *cptr = s;
> ++cptr;
> putc( *cptr, stdout );
>
> will correctly print "e" because "char" and "unsigned char" have the
> same width, and when we add one to cptr, it points to the correct
> location in memory.
>
> What I'm getting at is this. Because all the chars have the same
> width, it doesn't matter WHAT kind of pointer you pass in to a
> function: char, signed char, or unsigned char. Pointer arithmetic
> just works, and it works because they all have the same width.
>
> On the other hand, the data is what gets mangled if you don't use the
> correct type:
>
> char c = 255;
> printf("%d", c);
>
> prints, as expected, -1. Not 255.
>
> So it seems to me that if the compiler complains about anything, it
> should complain about passing a different type of char, not a
> different type of char *.
>
> Why does gcc 4 complain about passing different "char *" and not
> "char"?
>
> And is this because of the standard or is it gcc specific?

Cue the **Fundamental axiom of the C++ type system**, stated as
follows:

A* is automatically convertible to B* if and only if A is a B.
(Likewise for pass by reference.)

(This is my own generalization though, and there may actually be
exceptions.)

When handling inheritance, if Derived is a Base (Derived inherits from
Base), then Derived* can be automatically converted to Base*. But a
Derived* is not a Base*, so a Derived** cannot be automatically
converted to a Base**.
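
To make the inheritance case concrete, here is a minimal sketch. The
names Base, Derived, upcast and reseat are made up for illustration.
The static_asserts ask the compiler which implicit conversions it
accepts, and reseat shows the kind of function that would break the
type system if Derived** were allowed to become Base**.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical classes, used only for illustration.
struct Base { virtual ~Base() = default; };
struct Derived : Base { int extra = 42; };

// A Derived is a Base, so Derived* converts to Base* implicitly.
static_assert(std::is_convertible<Derived*, Base*>::value,
              "Derived* -> Base* is implicit");

// A Derived* is not a Base*, so Derived** does not convert to Base**.
static_assert(!std::is_convertible<Derived**, Base**>::value,
              "Derived** -> Base** is rejected");

// The allowed conversion, written out as a function.
Base* upcast(Derived* d) { return d; }

// A function taking Base** may store a pointer to a plain Base through it.
// If a Derived** could be passed here, the caller's Derived* would end up
// pointing at an object that is not a Derived.
void reseat(Base** slot, Base* replacement) {
    assert(slot != nullptr);
    *slot = replacement;
}
```

With a Base* variable, reseat(&slot, &some_base) is fine. With a
Derived* variable the same call does not compile, which is the point.
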
When dealing with templates, you cannot pass vector<Derived> where a
vector<Base> is expected, neither by reference nor by pointer, because
vector<Derived> is not a vector<Base> (because if you were to stick a
new Base into the vector, then it would violate the type of
vector<Derived>).
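
The vector case can be sketched the same way. Base, Derived,
append_base and copy_as_bases are hypothetical names. The sketch
assumes only that Derived publicly inherits from Base. The explicit
copy at the end is one way around the restriction, at the price of
slicing each element down to its Base part.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical classes, used only for illustration.
struct Base { int id = 0; };
struct Derived : Base { int extra = 0; };

// Neither the reference nor the pointer conversion is accepted.
static_assert(!std::is_convertible<std::vector<Derived>&,
                                   std::vector<Base>&>::value,
              "vector<Derived>& is not a vector<Base>&");
static_assert(!std::is_convertible<std::vector<Derived>*,
                                   std::vector<Base>*>::value,
              "vector<Derived>* is not a vector<Base>*");

// This is the operation that would corrupt a vector<Derived> if the
// conversion were allowed: it sticks a plain Base into the vector.
void append_base(std::vector<Base>& v) { v.push_back(Base{}); }

// An explicit element-by-element copy is allowed. Each Derived is
// sliced to its Base part, so the result is a genuine vector<Base>.
std::vector<Base> copy_as_bases(const std::vector<Derived>& in) {
    std::vector<Base> out(in.begin(), in.end());
    assert(out.size() == in.size());
    return out;
}
```
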
Supposing you wanted to create a new reference_counted_pointer<T>. A
reference_counted_pointer<Derived> is not a
reference_counted_pointer<Base>, and cannot be used as such, but you
would want to implement all of the appropriate conversions when writing
reference_counted_pointer<T> to mimic the semantics of an ordinary
pointer.
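
A rough sketch of what "implementing the appropriate conversions"
could look like follows. This is not a production smart pointer: it is
not thread safe, assignment is simply disabled to keep it short, and
the member names are my own. The interesting part is the templated
converting constructor, which is enabled only when U* itself converts
to T*, so the wrapper follows the same rule as the raw pointer.

```cpp
#include <cassert>
#include <type_traits>

template <typename T>
class reference_counted_pointer {
public:
    explicit reference_counted_pointer(T* p = nullptr)
        : ptr_(p), count_(p ? new long(1) : nullptr) {}

    reference_counted_pointer(const reference_counted_pointer& o)
        : ptr_(o.ptr_), count_(o.count_) { if (count_) ++*count_; }

    // Converting constructor: takes part in overload resolution only
    // when U* converts to T* (for example Derived* to Base*).
    template <typename U, typename = typename std::enable_if<
                  std::is_convertible<U*, T*>::value>::type>
    reference_counted_pointer(const reference_counted_pointer<U>& o)
        : ptr_(o.ptr_), count_(o.count_) { if (count_) ++*count_; }

    // Assignment is left out of this sketch.
    reference_counted_pointer& operator=(const reference_counted_pointer&) = delete;

    ~reference_counted_pointer() {
        if (!count_) return;
        assert(*count_ > 0);
        // The last owner deletes through its own T*, so T needs a
        // virtual destructor if it may point at a derived object.
        if (--*count_ == 0) { delete ptr_; delete count_; }
    }

    T* get() const { return ptr_; }
    long use_count() const { return count_ ? *count_ : 0; }

private:
    template <typename U> friend class reference_counted_pointer;
    T* ptr_;
    long* count_;
};

// Hypothetical classes, used only for illustration.
struct Base { virtual ~Base() = default; };
struct Derived : Base {};

static_assert(std::is_convertible<reference_counted_pointer<Derived>,
                                  reference_counted_pointer<Base>>::value,
              "mimics Derived* -> Base*");
static_assert(!std::is_convertible<reference_counted_pointer<Base>,
                                   reference_counted_pointer<Derived>>::value,
              "no implicit downcast, just like raw pointers");
```
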

signed char is not an unsigned char, but they are convertible. However,
signed char * is not convertible to unsigned char *, and to force such
a conversion, you would use a reinterpret_cast<> (which reinterprets
the actual bits according to a different type), or as it seems from
Bill's post, a static_cast<> (which is generally safer when it's
allowed).
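
Here is a small sketch of the char case. The helper names are
hypothetical. As far as I can tell, a direct static_cast between the
two pointer types is rejected, and static_cast only gets you there by
way of void *, so both spellings are shown.

```cpp
#include <cassert>
#include <type_traits>

// Plain char is its own type, distinct from both of the others.
static_assert(!std::is_same<char, signed char>::value &&
              !std::is_same<char, unsigned char>::value,
              "char is a third, distinct type");

// The values convert implicitly...
static_assert(std::is_convertible<signed char, unsigned char>::value,
              "signed char -> unsigned char is implicit");

// ...but the pointers do not.
static_assert(!std::is_convertible<signed char*, unsigned char*>::value,
              "signed char* -> unsigned char* is rejected");

// Forcing the conversion with reinterpret_cast.
unsigned char* as_unsigned(signed char* p) {
    return reinterpret_cast<unsigned char*>(p);
}

// The static_cast spelling, which has to go through void*.
unsigned char* as_unsigned_via_void(signed char* p) {
    return static_cast<unsigned char*>(static_cast<void*>(p));
}

// Reads the same byte through the unsigned view.
unsigned first_byte(signed char* p) {
    assert(p != nullptr);
    return *as_unsigned(p);
}
```

Both helpers return the same address. Only the type used to read the
byte changes.
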
--Ken Bloom
--
I usually have a GPG digital signature included as an attachment.
See http://www.gnupg.org/ for info about these digital signatures.