- any new language derived from C but without its ancient and/or modern warts should be called 'P', and the one after that 'L': BCPL -> B -> C -> P -> L

  ESR says: "I have a friend working on a language he calls 'Cx' which is C with minimal changes for type safety; the goal of his project is explicitly to produce a code lifter that, with minimal human assistance, can pull up legacy C codebases. I won't name him so he doesn't get stuck in a situation where he might be overpromising, but the approach looks sound to me and I'm trying to get him more funding."

  Hmmm..... very very interesting! It is likely Drew DeVault: https://drewdevault.com/ I.e. "Hare": https://harelang.org/

- Zig -- http://ziglang.org/ (by Andrew Kelley) ready and running, apparently. BUT oh, gawd -- it's currently written in C++ and uses CMake (though it apparently comes with its own built-in build system so that Zig programs don't need anything like CMake, etc.)

- Jai: an imperative, statically/strongly typed, C-style language by Jonathan Blow https://inductive.no/jai/ https://github.com/BSVino/JaiPrimer/blob/master/JaiPrimer.md No implicit type conversions. No header files. Beautiful and elegant. Built-in build system. Jai code can be built on top of existing C libraries. (Need to know more about its runtime.)

- V -- https://vlang.io Perhaps the most mature of the lot.

- look at http://c0.typesafety.net/ too...

- Check out C-Rusted: https://arxiv.org/abs/2302.05331

- NO MORE "undefined behaviour"!!! Pick something sane and stick to it! "Undefined behaviour" is Evil, by definition. As we were told by C.A.R. Hoare, "Premature optimization is the root of all evil in programming." Undefined behaviour permits, and is being exploited by compilers to, optimize all code as soon as possible and in unexpected ways. Therefore Undefined Behaviour is Evil. Q.E.D.
The problem with modern "Standard" C is that instead of refining the definition of the abstract machine to match the most common and/or logical behaviour of existing implementations, or making illogical operations such as dereferencing a nil pointer into "implementation defined" operations, the standards committee chose to throw the baby out with the bath water and make whole swaths of conditions into so-called "undefined behaviour" conditions.

An excellent example is the class of data-flow optimizations that are now commonly abused to sometimes elide security/safety-sensitive code:

    int
    foo(struct bar *p)
    {
        char *lp = p->s;

        if (p == NULL || lp == NULL) {
            return -1;
        }
        lp[0] = '\0';

        return 0;
    }

Any programmer worth their salt will assume the compiler can calculate the offset of 's' at compile time, and thus anyone ignorant of C's new "undefined behaviour" rules will guess that at worst some location on the stack will be assigned a value pulled from low memory (if that doesn't cause a SIGSEGV). The problem is that the dereference of 'p' is "hidden" from all but the most pedantic reader on first glance, and on second glance it is all too easy to assume that the dereference may not happen right away, because we might assume that any optimizer worth its salt SHOULD defer it until the first use of 'lp', perhaps not even allocating any stack space for 'lp' at all -- i.e. replacing all occurrences of 'lp' with 'p->s'. Think now if there were a page of unrelated code between the declaration, definition, and initialisation of 'lp' and the tests of 'p' and 'lp', and the initialisation was moved without proper review to the declaration by some more junior programmer. Now even an expert reader will struggle to spot the UB.
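For contrast, here is a minimal defensive rewrite (my own sketch; 'struct bar' is fleshed out with hypothetical members) that tests the pointer before any expression reads through it, so the NULL check cannot be deleted on the grounds of a prior dereference:

```c
#include <stddef.h>

/* Hypothetical layout; only the 's' member matters here. */
struct bar {
	int x;
	char *s;
};

int
foo(struct bar *p)
{
	char *lp;

	/*
	 * Test 'p' BEFORE anything reads through it: no dereference
	 * precedes the check, so no optimizer can infer p != NULL.
	 */
	if (p == NULL) {
		return -1;
	}
	lp = p->s;
	if (lp == NULL) {
		return -1;
	}
	lp[0] = '\0';

	return 0;
}
```

The cost is purely stylistic: the initialisation moves below the check, which is exactly the discipline the original code lost.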
Another example:

    uint16_t
    mul(uint16_t a, uint16_t b)
    {
        return a * b;
    }

The "usual arithmetic conversions" will apply before the multiplication, and the unsigned shorts will be converted to (signed) ints; so if int is larger than short, but not large enough to hold the product of two unsigned shorts, then there can be overflow, and signed integer overflow is UB. The fix would have to be:

    uint16_t
    mul(uint16_t a, uint16_t b)
    {
        unsigned int aa = a;
        unsigned int bb = b;

        return aa * bb;
    }

Worse yet, this example stems from actual Linux kernel code like this:

    static int
    podhd_try_init(struct usb_interface *interface,
                   struct usb_line6_podhd *podhd)
    {
        struct usb_line6 *line6 = &podhd->line6;

        if ((interface == NULL) || (podhd == NULL))
            return -ENODEV;
        ....
    }

Here some language-lawyer-wannabees might try in vain to argue over the interpretation of "dereferencing", yet again any C programmer worth their salt knows that the address of a field in a struct is simply the sum of the struct's base address and the offset of the field, the latter of which the compiler obviously knows at compile time, and so adding a value to a NULL pointer should never EVER be considered invalid or undefined! (I suspect the LLWs are being misled by the congruence between "a->b" and "(*a).b".) Worst of all, consider this example:

    void *
    foo(struct bar *p)
    {
        size_t o = offsetof(struct bar, s);

        if (p == NULL)
            return NULL;
        ....
    }

And then consider an extremely common example of "offsetof()" which might very well appear in a legacy application's own code because it pre-dated <stddef.h>, though indeed this very definition has been used in <stddef.h> by several standard compiler implementations, and indeed it was specifically allowed in general by ISO C90 (and only more recently denied by C11, sort of):

    #define offsetof(type, member) \
        ((size_t)(unsigned long)(&((type *)0)->member))

or possibly (for those who know that pointers are not always "just" integers):

    #define offsetof(type, member) \
        ((size_t)((char *)&((type *)0)->member - (char *)0))

Here we have very effectively and entirely hidden the fact that the '->' operator is used with 's'. Any sane person with some understanding of programming languages should agree that it is wrong to assume that calculating the address of an lvalue "evaluates" that lvalue. In C the '->' and '[]' operators are arithmetic operators, not (immediately and on their own) memory access operators. Any sane person who happens to know that the definition of NULL as "0" (i.e. a plain literal integer zero) is valid will also wonder why adding a value to a pointer initialised to zero is any different from assigning that value.

Sadly C's new undefined behaviour rules, as interpreted by some compiler maintainers (whom I label "premature optimiser warriors"), now allow the compiler to STUPIDLY assume that since the programmer has knowingly put a supposed dereference of a pointer on the first line of the function, any comparisons of that pointer with NULL further on are OBVIOUSLY never ever going to be true, and so it can SILENTLY wipe out the whole damn security check. I guess I'm saying that modern compiler maintainers are not sane, and at least some of the more recent C Standards Committee members are definitely NOT sane and/or friendly and considerate. Premature optimiser zealots are more interested in optimising away code than they are in the correct and logical operation of code.
C's primitive nature encourages the programmer to think in terms of what the target machine is going to do, and as such it is extremely sad and disheartening that the standards committee chose to endanger users in so many ways. It's not that evaluating something like (1<<32) might have an unpredictable result, but rather that the entire execution of any program that evaluates such an expression is ENTIRELY meaningless! Indeed, according to "Standard C" the execution is not even meaningful up to the point where the undefined behaviour is encountered. Undefined behaviour trumps ALL other behaviours of the C abstract machine. All in the goal of attempting comprehensive maximum possible optimization of all code at any expense, INCLUDING the correct operation of the program.

Not all so-called "undefined behaviours" are quite this bad, yet, but in general we would be infinitely better off with a more completely defined abstract machine that might force some target architectures to jump through hoops, instead of forcing EVERY programmer to ALWAYS be more careful than EVERY conceivable optimizer.

As Phil Pennock said:

    If I program in C, I need to defend against the compiler
    maintainers.  [[ and future standards committee members!!! ]]

    If I program in Go, the language maintainers defend me from my
    mistakes.

And I say: Modern "Standard C" is actually "Useless C" and "Unusable C".

I don't think the C language (in all lower-case, un-quoted, plainly) is the problem -- I think the problem is the wording of the modern standard, and the unfortunate choice to use the phrase "undefined behaviour" for certain things. This has given "license" to premature optimization warriors -- and their over-optimization is the root of the evil I see in current compilers. It is this unfortunate choice of describing things as "undefined" within the language that has made modern "Standard C" unusable (especially for any and all legacy code, which is most of it, right?).
We should outlaw the use of the phrase "undefined behaviour" and make all instances of it into "implementation defined behaviour", with a very specific caveat that such instances do not, will not, and cannot ever allow optimizers to even think of violating any conceivable principle of least astonishment. E.g. in the foo() example above, the only thing allowed would be for the implementation to do as it pleases IFF and when the pointer passed was actually a nil pointer at runtime (and perhaps in this case with a strong hint that the best and ideal behaviour would be something akin to calling abort()).

- where behaviour cannot be defined due to type punning (e.g. with a poorly designed printf() like C's), the language must not allow the compiler ANY free rein. The compiler must produce code that will run and which will do something that an "insider" would consider reasonable for the target platform.

- Too bad printf(3) et al don't have a format flag akin to '*' which allows you to pass the sizeof() and the signedness of an integer type to it, so that you don't have to hard-code the integer size in the format string....

    int128_t foo = -1;
    uint64_t ubar = 1;

    #define issigned(v)  (((typeof(v)) -1 < 0) ? true : false)
    #define INTPARAM(v)  issigned(v), sizeof(v), v

    printf("%Sd %Sx\n", INTPARAM(foo), INTPARAM(ubar));

  Better yet would be some way to avoid the horrid INTPARAM() macro using some kind of built-in language feature, but not going so far as to provide the libc author with full type introspection.

- it is sometimes quite sad that the same keyword "break" is used both as a statement to escape loops, as well as a statement to "escape" switch blocks.... I really wish it were "endcase" or something like that! Another reserved word wouldn't be the end of the world!
- somewhat along the lines of an 'endcase' keyword, Rick Marshall suggests an enhanced loop statement:

    loop {
    init:  /* include variable definitions scoped to the loop - block */
    test:  /* loop while test true - expression */
    next:  /* things to do before the next iteration, such as incrementing variables - block */
    body:  /* things to do on each iteration - block */
    exit:  /* things to do when the test fails - block */
    }

  That sort of implies adding even more new context-sensitive keywords, which is a bit of a mess, so.... Is it really cleaner and more intuitive than this?

    for (init; test || (exit, 0); next) {
        body;
    }

  The only thing it seems to do is allow a block of code for the "init", "next", and "exit" parts, where expressions must be used in a "for()". So, NO.

- watch out for nested functions (GCC), Blocks (Apple Clang), and possibly the future of Lambdas (Jens Gustedt & the ISO C committee) https://thephd.dev/lambdas-nested-functions-block-expressions-oh-my

- sometimes it is useful to be able to use sizeof() to find out directly what sized object a function call will return, without having to call that function or know its type, e.g.: sizeof(time(NULL)) Is this a GCC-ism? (It isn't: sizeof() does not evaluate its operand in ISO C, except for variable-length array types.)

- C99 should have had typeof() and compound statement expressions (with proper scoping!)

- char should always be unsigned char -- only use "sbyte" for signed byte values. THERE SHOULD NOT BE ANY "signed char" ALLOWED, EVER!

- but maybe 'char' should just be UTF-8 32-bit code-points?

- int should always be unsigned as well (and it should be explicitly described so with the "uint" official keyword, and "sint" for signed integers). It could also be called "cint", for counting integer, implying it is unsigned.

- C syntax should not allow single statements in places where blocks are also allowed. I.e. even single statements for 'if', 'for', 'do', 'while', etc. must always be blocks (with braces). Prevent Apple's "goto fail" bug from ever happening!
  https://cacm.acm.org/blogs/blog-cacm/173827-those-who-say-code-does-not-matter/fulltext

  Perhaps 'switch' statement syntax could also be cleaned up to always use such clear blocks-with-braces syntax too, though it should probably use a different keyword as well, just to avoid confusion.

- PL/1 "vectors" have merit as one-dimensional arrays which the compiler can guarantee cannot be accessed out-of-bounds (the address of a vector is always a "void *"???)

- along the lines of "vectors", what about Pascal subrange types, and the idea of requiring that variables used to subscript arrays be of a type which limits their values to the size of the array? This could also go so far as to avoid the need for nested loops to iterate over multi-dimensional arrays.

- do C99 anonymous unions make it possible to have "const" fields in structs which are unioned with matching non-const fields with a "set_" prefix on their name? Thus you can only set the value if you use the "set_*" name, and presumably you only do that in, say, the main thread. (probably not -- and C11 makes such type-punning even less welcome) This would help solve one of the few places where one might wish to use "const" qualifiers to signal storage ownership over just specified fields to another function when passing a struct (or struct *) parameter.

  On the other hand, explicit use of unions to view an object with different type interpretations should still be OK: https://gustedt.wordpress.com/2016/08/17/effective-types-and-aliasing/

  Note that using unions for the purposes of memory reinterpretation (i.e. writing one member and then reading another, aka type-punning) has always been allowed by most C implementations (7th Edition Unix source did this in several places). C99 was initially clumsily worded, appearing to make type-punning through unions undefined.
In reality, type-punning through unions is legal in C89, and legal in C11, and it was actually legal in C99 all along; however it took until 2004 for the committee to fix the incorrect wording with the release of C99-TC3.

- what would be necessary to add to C to make it possible to pass a limited set of self-identifying objects through a function parameter to a sub-function which only accepts one of that set? I.e. so that we could avoid using "void *" parameters which are passed through to functions selected at runtime? (e.g. as with the compare functions passed to the various C90 sort functions) Perhaps there could be a type called "generic" which would essentially internally be an anonymous union of all possible types, combined with a hidden type specifier, and something like C11's _Generic() could be used to determine the type it had been assigned with. (how do pointers to aggregates work with such a thing? and what about whole aggregates?) (Some people might like to call these "generics", but I hate that term.)

- avoiding use of "void *" for generic typing could allow a rule that prevents ever coercing a "void *" back into any other type of pointer, thus preventing pointer aliasing entirely?

- Positional parameters are evil (or at least error prone), especially for variable numbers of parameters. Tony Hoare suggested in a 1972 paper [Hoare72] that (a) "the type of the result of every operation should be known at compile time from the types of the operands and the identity of the operator"; and (b) "checking the match and consistency of parameters and arguments of a subroutine can and should be accomplished at compile time". C has always done well with the former, albeit with some implied conversions, but it only finally gained the latter with prototypes in C89. However prototypes don't solve the problem of assigning the correct argument value to the desired parameter when there might be more than one parameter of the same type.
[Hoare72] "Prospects for a Better Programming Language", in "Infotech State of the Art Report: High Level Languages", Vol. 7, 1972

So, C should have named parameters for functions (and their return values) -- this would also allow enhancements to printf() and scanf() et al to make them more reliable and easier to use correctly:

    int i = 99;
    char *s = "a string";

    smart_printf(fmt = "int=%{i}, string=%{s}", i = i, s = s);

(This might be done by inventing some special syntax hint which allowed the compiler to translate the function parameters from symbolic form to something like a vector of parameters, and replace the symbols in the format string with indices to the appropriate parameter in the vector, such that a library function could still be used to implement the underlying formatting -- assuming there is also some way for the compiler to include type information meaningful to the called function in the parameters vector.)

Hmmmm.... how do we avoid the redundancy (i.e. "i = i" should be just "i") and directly allow the internal symbol name to be used in the format string? Perhaps one solution is like that used in C++? -- i.e. printf() and friends should be a language construct, not a function call; then the compiler is in control and knows what the symbols are. BUT!!! Sometimes it seems as if so many things are so much better expressed as function calls -- one would need some way to overload the underlying I/O mechanism to capture the output, or pass it to a callback handler, or something, in order to build custom formatting "functions".
Or just always do the following (and optionally also for return values), as it should take exactly the same amount of stack space (though maybe padding could be turned off for this class of parameter structs):

    struct somefunc_params {
        char *p1;
        int i1;
        struct fooness foo;
    };

    int
    somefunc(struct somefunc_params p)
    {
        if (p.i1)
            printf("%s", p.p1);

        return 0;
    }

    res = somefunc((struct somefunc_params) {
        .p1 = "foo",
        .i1 = 1,
        .foo = (struct fooness) {.blah = 4}});

Note that in Go it would seem the ':' is used instead of '=' for such initialisations. The only really ugly parts are the need for the cast and the extra braces, and of course the need to prefix every reference to every parameter with "p.", which is why named parameters would be _nice_. We would of course assume the storage for the struct would effectively be zeroed, as with normal struct initialisers. It could be pre-allocated by the toolchain in the initialised data segment.

Ada allows defaults for parameters to be declared with the function. We could do it by allowing initialisers in struct declarations:

    struct somefunc_params {
        char *p1;
        int i1 = 4;
        struct fooness foo;
    };

and everything else would be zeroed. This could be generally useful.

Note that using structures for parameters means entirely foregoing the compiler's ability to verify the correct use of an interface, especially if/when that interface changes in such a way that all callers will require updating to be sure that they conform to the new interface requirements.

In a discussion of a proposal for named parameters in Go, it was mentioned that the compiler could "infer, from the function signature, the type of struct literal being passed in". This would at least avoid the need for the cast(s) (note it would work for any struct assignment, as with .foo here):

    res = somefunc({.p1 = "foo", .i1 = 1, .foo = {.blah = 4}});

Furthermore one could also assume the presence of the '.'
operator in front of an identifier implies that named parameters are in use, and so we can skip at least the outer braces (the struct parameter ".foo" cannot so easily be freed of braces, as one needs them to detect nesting and avoid false errors when some struct argument has same-named field tags as the main params struct, e.g. the 'i1' here):

    res = somefunc(.p1 = "foo", .foo = {.blah = 4, .i1 = 42}, .i1 = 1);

Maybe (I've not thought about this enough yet) we could even assume assignments in parameters also infer that a struct literal is being passed, and so also avoid the dots:

    res = somefunc(p1 = "foo", foo = {blah = 4, i1 = 42}, i1 = 1);

Similarly, inside the function the lack of a tag name for a '.' operator could imply the parameter struct, assuming there is only one (which there would be if all calls had implied parameter structs), sort of like having anonymous sub-struct/union fields:

    int
    somefunc(struct somefunc_params p)
    {
        if (.i1 > 0)
            printf("%s", .p1);

        return .i1 - 1;
    }

(BTW, someone else in that discussion about Go complained that one has to declare the struct used for the parameters, but of course one must declare the parameters anyway -- the only additional elements required are the unique struct tag name, and the need to express it twice; well, and of course having to prefix parameter names with ".".)

This may help with implementing "generics" more cleanly in C11.... E.g. if the helper macro also uses the C11 "_Generic()" thing to set a corresponding enum or similar for every union parameter. However, to make it truly safe it has to go deeper into the language, to avoid risking evaluating arguments more than once (n.b. though, the controlling expression and the expressions of the selections that are not chosen are never evaluated).
Also currently qualifiers mess this up badly too and GCC and Clang don't agree (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1930.htm : "gcc treats all such expressions as rvalues and does all applicable conversions of 6.3.2.1, that is lvalue to rvalue and array to pointer conversions; clang treats them as lvalues") See points above for another take on it with a standard "generic" type. To Remind Myself: A "parameter" is a variable (or identifier) used to refer to one of the pieces of data provided as input to a function or subroutine. These pieces of data, generally the result of evaluating an expression at the point of the call, are called "arguments". We can also call the parameters in declarations the "formal parameters", and the values passed in a particular call the "actual parameters". Note: (from http://cacm.acm.org/magazines/2009/11/48444-you-dont-know-jack-about-software-maintenance/fulltext) However, that is not the real meaning of continuous. The real continuous approach comes from Multics, the machine that was never supposed to shut down and that used controlled, transparent change. The developers understood the only constant is change and that migration for hardware, software, and function during system operation is necessary. Therefore, the ability to change was designed from the very beginning. Software in particular must be written to evolve as changes happen, using a weakly typed high-level language and, in older programs, a good macro assembler. No direct references are allowed to anything if they can be avoided. Every data structure is designed for expansion and self-identifying as to version. Every code segment is made self-identifying by the compiler or other construction procedure. Code and data are changeable on a per-command/process/system basis, and as few as possible copies of anything are kept, so single copies could be dynamically updated as necessary. 
With Multics, the developers did all of these good things, the most important of which was the discipline used with data structures: if an interface took more than one parameter, all the parameters were versioned by placing them in a structure with a version number. The caller set the version, and the recipient checked it. If it was completely obsolete, it was flatly rejected. If it was not quite current, it was processed differently, by being upgraded on input and probably downgraded on return.

An elegant modern form of continuous maintenance exists in relational databases: one can always add columns to a relation, and there is a well-known value called NULL that stands for "no data." If the programs that use the database understand that any calculation with a null yields a NULL, then a new column can be added, programs changed to use it over some period of time, and the old column(s) filled with NULLs. Once all the users of the old column are no more, as indicated by the column being NULL for some time, then the old column can be dropped.

Note also that named parameters with default values allow for this kind of more implicit API evolution -- or at least for extension, though parameters can of course always be ignored.

Tony Finch shows a nasty CPP trick to hide the braces at:

    Named and optional function arguments in C99
    http://fanf.livejournal.com/148506.html

[[ slightly edited to correct the terminology and coding style... ]]
[[ see also ~/src/tc99namedparams.c ]]

The main limitation is there must be at least one non-optional argument, and [[if making use of default values]] you need to compile with: -std=c99 -Wno-override-init

The pattern works like this, using a function called repeat() as an example:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    // Define the argument list as a structure.  The dummy argument
    // at the start allows you to call the function with either
    // positional or named arguments.
    struct repeat_params {
        void *dummy;
        char *str;
        int n;
        char *sep;
    };

    // Define a wrapper macro that sets the default values and
    // hides away the rubric.  [[The magic is in the __VA_ARGS__]]

    #define repeat(...)                         \
        repeat((struct repeat_params) {         \
            .n = 1,                             \
            .sep = " ",                         \
            .dummy = NULL,                      \
            __VA_ARGS__ })

    // Finally, define the function, but remember to suppress macro
    // expansion by wrapping the name with parentheses!

    char *
    (repeat)(struct repeat_params p)
    {
        char *r;

        if (p.n < 1) {
            return NULL;
        }
        r = malloc((p.n - 1) * strlen(p.sep) + p.n * strlen(p.str) + 1);
        if (r == NULL) {
            return NULL;
        }
        strcpy(r, p.str);
        while (p.n-- > 1) {             // accidentally quadratic
            strcat(r, p.sep);
            strcat(r, p.str);
        }

        return r;
    }

    int
    main(void)
    {
        // Invoke it like this
        printf("%s\n", repeat(.str = "ho", .n = 3));
        // Or equivalently
        printf("%s\n", repeat("ho", 3, " "));

        exit(0);
    }

- similarly, C should have named return values and allow multiple values to be returned, as in Go. This would avoid having to define a custom struct for a multiple-return function.

- having named return values may require getting rid of the comma operator though, unless the syntax is sufficient to distinguish between expressions and return-value assignment:

    return set_some_global(), result;

  v.s.:

    return set_some_global(foo), res1 = 2, res2 = 4, err = 0;

  In both cases we would want the first expression to be evaluated, thus calling the function 'set_some_global(foo)', and the second expression to set all the desired (non-default) return values for the function.
This does look potentially too confusing, but braces could be used to "contain" the multiple expressions of the return value, thus making clear where the expressions are, but now disallowing comma operators in those expressions:

    return { res1 = 2, res2 = 4, err = 0 };

The next trick will be figuring out how to make these return values position independent, and also work for the idiom where they can be directly used as arguments to another function. (Here I show implicit named parameters with default values, and Go-style return declaration, which follows the parameter declarations, with multiple named return values, also some with default values, but without the Go 'func' keyword, which is superfluous if not using Go's package namespaces.)

    multi_return_func(char *p1 = "default", int i1 = 0, struct fooness foo)
        (int res1, int res2 = 0, int err = 1)
    {
        return {res1 = 2, err = 1};
    }

    return_handler_func(int res1, int err, int res2)
    {
        return;
    }

What's probably important here is that the symbol names for the return values of the first function match the parameters of the second function. Use:

    return_handler_func(multi_return_func(j, k, l));

- it would be desirable to have some better way of using an identifier as a string (vs. stringification) and (especially) vice versa, such that the compiler could offer some helpers for symbol lookup and translation, and perhaps some verification of usage.

- John Carmack on functional C: "if a function only references a piece or two of global state, it is probably wise to consider passing it in as a variable. It would be kind of nice if C had a 'functional' keyword to enforce no global references."

- both GCC and Clang now aggressively inline code, such as static functions, yet they often fail to provide the full features one expects from function call frames on the stack for debugging purposes, i.e.
when compiled with '-g'.

- perhaps there could be a "nevernull" qualifier for function return values, which would say that the function guarantees never to return a NULL pointer value (e.g. when allocation functions call abort() on failure). It would perhaps be mostly documentation to the user, though the compiler could provide a gentle reminder (of a path always taken) if a variable set from such a function is ever compared (un-modified) with zero in any way.

- speaking of handling allocation failures -- (per-thread?) heapmark() and heapfree(mark) functions should be available that would have the semantics of alloca(), such that error handling for complex code that does many allocations could free all those allocations in one go, kind of like a pool allocator with a poolfree() call; but this would allow standard malloc()/calloc()/realloc() allocations done, say, in a library to also be handled, e.g. if an error handler does longjmp() to escape the error.

- Hoare, C.A.R., Null References: The Billion-Dollar Mistake, August 25, 2009 https://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare

  Ideally it should be more difficult -- indeed very difficult -- to accidentally dereference a NULL pointer. Modern Eiffel makes it impossible to dereference a NULL pointer. Perhaps in a safer C, dereferencing a NULL struct pointer could always work as if it were pointing at a new (and unique for every reference) 'auto' class (i.e. zero-filled) object -- at least for R-values, and maybe even for L-values as well. As per the following though, this was not easy to achieve in Eiffel: https://cacm.acm.org/magazines/2017/5/216322-ending-null-pointer-crashes/fulltext

- NULL considered harmful: http://ftp.rodents-montreal.org/mouse/blah/2009-10-09-1.html

- as an aside, mouse says "a void * nil pointer has the same size and representation (and parameter passing mechanism, if applicable) as a nil pointer of the relevant type."
to imply that this may not always be the case on all machines and/or compilers for a given machine. However, if I'm not mistaken, C (C90?) now guarantees that the representation of "(void *) 0" must be the same (in storage and parameter passing) as a nil pointer of _any_ type, _and_ simultaneously that storage filled with zero-valued bytes must also be treated as a nil pointer when accessed as a pointer value.

http://gustedt.wordpress.com/2010/11/07/dont-use-null/

- my response to the latter:

  I do agree with what you've said, in that zero will suffice in every situation where a null pointer value is required. However I do still believe that there is significant value in expressing it as "NULL" instead of "0", as did the inventors of the language, since they said so in their first very concise description of the language. Quoting from the 1978 1st edition of K&R, pages 97-98: "We write NULL instead of zero, however, to indicate more clearly that this is a special value for a pointer."

  I just wish NULL had been defined as an anonymous enum with the value of zero, instead of as a preprocessor macro. (It is very sad that a couple of rather important features of C, including enums, were not implemented until a few months after the first edition of the book had already gone to press. If the Internet had existed at the time then others studying and re-implementing the language might have known about these features far sooner. Unfortunately the Nov. 15, 1978 memorandum describing them did not reach anywhere near as wide an audience as the book did.)

  I also really despise the idiots in the standards committees who have allowed the alternative definition of NULL as "(void *) 0". Perhaps it was well meant, but in the end I believe it has caused no end of confusion and brain damage to endless numbers of programmers.
As you've said in your more recent essay arguing against casts, one should never cast NULL to the type of the variable in an assignment -- rather one should always rely on the implicit conversion in true _assignments_. However that doesn't mean one will never have to cast either NULL or zero to a specific pointer type in some situations. It is very important to keep in mind that pointers may have different representations depending on what they are used to point to (even though the language explicitly requires all pointer types to have the same representation of a null pointer, and that the width of a null void pointer should be as wide as the widest pointer type). So, as you point out, it becomes critical to also use a cast when you're using either NULL or zero as a function parameter value for a parameter which is expected to be a pointer value, and especially when this is done in a call to a function taking a variable number of parameters which will be interpreted at runtime (though this is equally important when a prototype is not in the scope of the call). While "(void *) 0" does obviously mean the value is to be expressed as a pointer, and that it is an expression of the representation of a null pointer, I would suggest that the idiom of using "NULL" [[defined as the integer constant 0]] in C is strong and widespread and important enough that it is a good idea to continue that tradition, and so I argue against the advice in your essay above.

- const

  Tony Finch says (of the const loophole created by the likes of strchr()): It would be much better if we could write something like

      const char *strchr(const char *s, int c);

  where 'const' indicates variable constness [[with tag 'A']]. Because the same variable [[tag]] appears in the argument and return types, those strings are either both const or both mutable.
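The loophole Finch is describing can be shown in two lines: strchr() accepts a const char * yet hands back a plain char * into the same storage, silently discarding the qualifier (a minimal demonstration, not his code):

```c
#include <string.h>

/* strchr() launders away const: no cast, no diagnostic. */
void demo(void)
{
	const char *ro = "read-only";
	char *p = strchr(ro, 'o');	/* compiles cleanly: const is lost */

	(void) p;
	/* *p = 'O'; would now be accepted by the compiler, yet it is
	 * undefined behaviour because the string literal is read-only */
}
```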
He then offers this alternative using C11 '_Generic()':

    #ifdef strchr
    #undef strchr
    #endif

Then we can create a replacement macro which implements parametric constness using _Generic():

    #define strchr(s, c) _Generic((s), \
        const char * : (const char *)(strchr)((s), (c)), \
        char *       : (strchr)((s), (c)))

But whether this will continue to work or not depends on the compiler and on the resolution of: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1930.htm

- also there's this problem with the trick of replacing strchr() above: reserved standard library function names. Standard C (C99?) no longer allows one to write one's own replacements for standard library functions. That is bogus beyond belief. Certainly it is messy for shared libraries that might internally call standard functions, but for static linking there cannot be any valid reason for disallowing replacement of any function -- first appearance wins.

- C without CPP
  - needs language constructs to make token joining and stringification possible
  - keyword renaming pragmas for non-standard keywords (e.g. to handle language extensions in a portable manner when two similar and otherwise compatible implementations choose different spellings)
  - what did Plan 9 do without #ifdefs for conditional compilation within a function definition? or did they just avoid it?
  - something like macros, but not as stupidly inelegant as C++ templates

- I think GCC and Clang are stupid to include '-Wunknown-pragmas' in '-Wall', especially for pragmas that are tagged with an implementation name (e.g. "#pragma impl diagnostic"). In general an unknown pragma may very well be something used to add GCC (or Clang, or vice versa) compatibility to some other compiler! (and having to wrap the ones that are already tagged with the implementation name in "#ifdef __impl__" is, well, tremendously stupidly redundant and repetitive!)
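For reference, Finch's _Generic() macro behaves like this in use (the macro is repeated here so the sketch stands alone; it needs a C11 compiler, and the char[] case relies on array-to-pointer decay in the controlling expression, which is exactly what n1930 was about):

```c
#include <string.h>

#ifdef strchr
#undef strchr
#endif

/* parametric constness via C11 _Generic(): the constness of the
 * result follows the constness of the argument */
#define strchr(s, c) _Generic((s), \
	const char * : (const char *)(strchr)((s), (c)), \
	char *       : (strchr)((s), (c)))

void demo(void)
{
	const char *cs = "hello";
	char buf[] = "hello";

	const char *cp = strchr(cs, 'l');	/* const in, const out */
	char *mp = strchr(buf, 'l');		/* mutable in, mutable out */

	*mp = 'L';		/* fine: buf is writable */
	(void) cp;
	/* char *bad = strchr(cs, 'l'); would now require a diagnostic:
	 * the loophole is closed */
}
```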
- GCC's "warning: signed and unsigned type in conditional expression [-Wsign-compare]" is kind of stupid. (Clang doesn't warn.)

- C needs "make this signed/unsigned" operators to avoid having to know the width of the source type. Perhaps "(-)" and "(+)" would suffice?

- GCC is very stupid to only issue -Wcast-align warnings when targeting a platform with stricter cast alignment requirements.

- Go's "defer" is absolutely necessary! Especially without a GC!

- Go's "panic()" and "recover" are a good idea too, i.e. for library code

- UBSan is stupid to warn about "runtime error: member access within misaligned address" when the runtime processor does not have any such restriction. (See also h_aliases[] access in ~/src/tgethostby*.c)

- ESR recommends back-porting ":=" from Go to C: http://esr.ibiblio.org/?p=7758

  There's also "__auto_type" in both GCC and Clang, which can be used similarly:

      #define let __auto_type const
      #define var __auto_type

      // Which can then be used like this:
      let x = floor(3.5); // x is const, so you aren't allowed to mutate it.
      var y = x;          // y is not const.

  This may be more sensible than ":=", which I find confusing and hard to spot when reading Go code. Note that in Go there's an apparent issue with multi-value returns and ":=" that can cause a variable to be re-declared. For some reason this behaviour of a multi-variable ":=" is necessary to make it play nicely with block scope, but I'm not sure I understand why. This is yet another strong argument against the ":=" syntax.

- Walter Bright (D Language) says many evils of passing arrays in C can be fixed: C99 attempted to fix this problem, but the fatal error it made was still not combining the array dimension with the array pointer into one type. C can still be fixed. All it needs is a little new syntax:

      void foo(char a[..])

  meaning an array is passed as a so-called "fat pointer", i.e. a pair consisting of a pointer to the start of the array, and a size_t of the array dimension.
  Of course, this won't fix any existing code, but it will enable new code to be written correctly and robustly. Over time, the syntax:

      void foo(char a[])

  can be deprecated by convention and by compilers. Even better, transitioning to the new way can be done by making the declarations binary compatible with older code:

      #if NEWC
      extern void foo(char a[..]);
      #elif C99
      extern void foo(size_t dim, char a[dim]);
      #else
      extern void foo(size_t dim, char *a);
      #endif

  This change isn't going to transform C into a modern language with all the shiny bells and whistles. It'll still be C, in spirit as well as practice. It will just relieve C programmers of dealing with one particular constant, pernicious source of bugs. [from http://digitalmars.com/articles/b44.html]

- John Carmack on programming style: (summarised from http://number-none.com/blow/john_carmack_on_inlined_code.html)

  The function that is least likely to cause a problem is one that doesn't exist, which is the benefit of inlining it. If a function is only called in a single place, the decision is fairly simple. In almost all cases, code duplication is a greater evil than whatever second-order problems arise from functions being called in different circumstances, so I would rarely advocate duplicating code to avoid a function. Inlining and delimiting the minor functions [[...]] and enclosing them in bare braced sections to scope their local variables and allow editor collapsing of the section is useful. I know there are some rules of thumb about not making functions larger than a page or two, but I specifically disagree with that now -- if a lot of operations are supposed to happen in a sequential fashion, their code should follow sequentially.

  If a function is only called from a single place, consider inlining it. If a function is called from multiple places, see if it is possible to arrange for the work to be done in a single place, perhaps with flags, and inline that.
  If there are multiple versions of a function, consider making a single function with more, possibly defaulted, parameters. If the work is close to purely functional, with few references to global state, try to make it completely functional. Try to use const on both parameters and functions when the function really must be used in multiple places.

- a not-bad style guide: https://git.sr.ht/~sircmpwn/cstyle
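Walter Bright's `char a[..]` fat pointer from the note above can be approximated in today's C by passing the pointer and dimension together as one struct; a minimal sketch (the names chars, chars_set and CHARS are invented for illustration):

```c
#include <stddef.h>

/* a "fat pointer": pointer and dimension travel together as one type */
struct chars {
	char  *ptr;
	size_t len;
};

/* bounds-checked store: an error return instead of undefined behaviour */
int chars_set(struct chars a, size_t i, char c)
{
	if (i >= a.len)
		return -1;
	a.ptr[i] = c;
	return 0;
}

/* convenience constructor: the dimension is captured from a real
 * array at the point where it is still known to the compiler */
#define CHARS(arr) ((struct chars){ (arr), sizeof (arr) })
```

Unlike the proposed `a[..]` syntax this gains no compiler enforcement and no ABI compatibility with existing code, but it shows the shape of what the fat-pointer parameter would carry.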