Import/research/reflection/docs/README.txt at 968ecbf457fa6645a22c91c52e46b7adda1d700b

Fork: 0
jryland / Import
Find file
Newer
Older
Import / research / reflection / docs / README.txt
John on 29 Dec 2020 30 KB bulk import from macbookpro checkouts
Raw Blame History

Kai language
============

   - inspired by concept of writing own langauge
     by Jonathan Blow writing Jai
   - inspired by state of C/C++ with all the UB
     and complexity of C++11 to C++20
       - too much legacy
       - too many ways to do the same thing
   - inspired by Nikolas Wirth interview in 2018
     on youtube

Motivation:

 C++ complexity, how many ways to just initialize a variable?

    - with C++11 there are atleast 3 ways to initialize an int.
    - atleast 4 ways to init a struct or class

        int main()
        {
          struct Struct1 { int x = 0; };
          struct Struct2 { int x;     }; 
          struct Object  { Object(int a = 0) {} };
      
          int x1 = 0;
          int x2 = { 0 };
          int x3{ 0 };
      
          Struct1 s1;
          Struct2 s2 = { 1 };
          Struct2 s3{ 1 };
          Struct2 s4 = Struct2{ 1 };
      
          Object obj1;
          Object obj2(1);
          Object obj3{1};
          Object obj4 = Object(1);
          Object obj5 = Object{1};
        }

     - the preferred way and styles keep changing
     - perhaps it is too much to design something which is enduring
     - backwards compatibility is great but C++ doesn't recognize that backwards
       compatibility has two edges - there is the backwards compat of the code
       with new compilers, but also the backwards compat of running new code on
       old machines. Not everywhere will get the new support for new compiler features
     - if it is just syntactic sugar, backwards compat with old machines could be
       maintained by having code adapters / converters, but no such thing exists
       (originally C++ was implemented with CFront which was this, but those days
       are long gone). Nothing converts C++20 in to C++98 or C but theoretically
       it is possible, but not a great idea anymore.
     - C was great because it thinly abstracted the underlying machine which allowed
       real progress because routine code didn't need to be coded in assembler.
     - C++ hitched a ride on the success of C and tried to elevate concepts to higher
       levels. Unfortunately this muddies the design of C, in C it is WYSIWYG in that
       when you add two variables you get addition, when you call a function, it goes
       to that function etc. C is close to the machine. C++ allowed for C code to
       compile with its compilers, so all C code was essentially C++ code too, and
       so it was a type of embrace and extend strategy that has continued with the
       evolution of C++. Generally embrace and extend is an evil strategy but it works.
       It is a kind of parasitic strategy. It is retrospect that when it succeeds it
       can be seen as evil, but failed or less successful attempts to embrace and
       extend don't seem to be quite seen that way. There were other attempts to
       embrace and extend C, but the most successful was C++ which became the main
       line branch on which C continues. Java on the otherhand broke backwards
       compatibility so was not really applying an embrace and extend strategy, so
       was less evil and more pure in this respect and became wildly successful.
       However Java is not Native, but can get native like performance but you also
       must take with it its memory model which can hurt performance.
     - C++ was designed around mutation which is bad. It is bad from a performance
       point of view for the current evolution of CPUs. This needs to be explained
       a bit. C++ has classes, a class is like a struct, but with member functions.
       These member functions change the state of the member variables. This is how
       C++ is taught at university.
         class Color { public: setColor(int a,..) { r = a; } private: int r,g,b; }
       In C you might not write it in such a way if you know what you are doing.
       In functional programming, you don't mutate the state, you return a new
       object each time. With CPUs we will reach a point where single threaded
       performance plateaus but multi-threaded performance continues to grow.
       Programming needs to be more multi-threaded. The issue with multi-threaded
       programming is sychronization of memory between threads. A CPU thread can
       work fast if it is operating on memory that no other thread is operating
       on. Things that are referenced that mutate memory are bad for this.
       Local variables on the stack are in the set of memory that a single thread
       uses so these are fine to mutate. Pointers to things which can be mutated
       or any global mutable state are bad. A object of a class in C++ though
       is pretty much this, the this pointer is a pointer to mutable state.
       If you have a pure function which doesn't mutate any of the input parameters
       or any global state, and then outputs a new value, you can safely run this
       on any thread. All of the program if written like this can more easily be
       made to run different parts of the program on different threads.
       Example in C:
           struct Color { int r,g,b; };
           Color newColor(int r, int g, int b) { return Color{r,g,b}; }
           Color setRed(const Color& col, int r) { return Color{r,col.g,col.b}; }
       In the above example, setting the red value didn't mutate the orginal color
       object, instead it created a new color object with the specified red value.


  Cross-cutting concerns, aspect oriented programming, subject-oriented programming:

    Wikipedia:  Subject-oriented programming as a "third dimension"[edit]
      Method dispatch in object oriented programming can be thought of as
      "two dimensional" in the sense that the code executed depends on both the
      method name and the object in question. This can be contrasted[20] with
      procedural programming, where a procedure name resolves directly, or one
      dimensionally, onto a subroutine, and also to subject oriented programming,
      where the sender or subject is also relevant to dispatch, constituting
      a third dimension.

   So concept here is dispatch depends on current context, not just the object.

   Aspect oriented programming is simpler, it gives a way to intercept function
   calls and inject code in to the preamble and postamble of any function. This
   might be something which could be adapted to be used in a functional programming
   context.

Design goals:

 At first this langauge will be implemented as
 a translator which parses this and outputs C code
 which is then compiled with a C compiler (or C++ compiler).
 Rational - C compilers are mature at creating
 optimized code and will do better. Also aides in
 debugging the output - checking with linting tools etc.
 Can also be used to generate C code for use perhaps.

 Perhaps extension to this is outputting javascript,
 although enscripten/wasm allows converting C/C++ to
 web, perhaps the quality of javascript code from
 something like this might be better.

 A later stage of development will be able to
 output directly machine code without converting to
 C first, perhaps by adapting the output to be the
 IR (intermediate representation) for another compiler
 tool chain such as LLVM or GNU.

 Another direction perhaps is using this as a scripting
 langauge, so if can build an interpreter, or similar is
 JIT converting to JS and running it via V8 or similar.

 And another stretch goal might be to be able to convert
 to GLSL/CUDA/OpenCL to execute on a GPU or some mix.
 Definately some modules will make writing something similar
 to shader code in this language possible.

 Proposed language grammar/syntax sample

  Syntax inspired by a mix of pascal and C
  a context free grammar
  ideally no preprocessor, code can be processed in a single parse
  where practical, fewer characters are required for a task
  template programming should feel like normal programming
  language designed to prevent common programmer errors

  Functions are pure functions
     - they do not mutate the input parameters
     - they do not mutate any global state
     - they can output a new object
     - because they don't mutate global state or inputs, then the only thing a function
         can do is return something. This means that an expression will do nothing unless
         there is both a rhs and lhs. So need to be assigning every expression. eg: can't
         call log("some message"); how to relax this or still make useful?
         How about point-cuts? Aspect-oriented programming? It might solve logging.
         Are callbacks a thing in functional programming?

Other Features / Inspiration:

  Rust
     - unsafe keyword to mark area using raw pointers etc.
     - heavy use of static analysis
     - but static analysis is not zero cost
           - cost to compile time is still a cost (not that C++ is fast to compile)
     - range-based loops can avoid range checks
     - data locking, not code locking
     - initialization of all variables
  Qt
     - ownership of objects - parent/child relationship, so clear ownership semantics
       of pointers, so easy memory management
  D
     - slices - views in to arrays
  Rust
     - cargo - build for multiple architectures / unknown architecture
  Variable initialization
     - OS when giving pages to a process needs to clear/wipe them first anyway
            - would be a security risk if a page of memory that was used by
              another process was handed to us without being cleared
     - within our process when memory is freed and then reused, it might not
            be cleared.
     - what reasons might it be valid to not initialize the memory of variables?
            - when a large allocation happens, it might just be virtually mapped,
              but after the first write it may then need to actually be assigned.
            - what needs to be avoided is reading before the memory has been written.
            - it can be avoided if we know it will be zero when we first read it.
            - OS policy might make this so, so can avoid clearing to zero again
              and avoid causing the memory to be committed. (if it is on demand
              allocated when written to)
      - imagine alloating GBs of memory, perhaps via a file mapping. That memory
        is just page table entries at first. When write to a specific address in
        the range, on the first write a fault occurs which is caught and this
        causes the OS to then find a free page which it then clears and then
        resumes execution at the instruction that caused the fault which will
        retry writing to that address which will then write to that page. Clearing
        the memory will cause the OS to then try to allocate all those pages,
        even if virtually they exist and if were attempted to be read would have
        returned zero, so in a virtual sense are all cleared already. It would
        be a waste. But it wouldn't be much of a waste if we did immediately
        need all of that memory and began reading and writing to all of it.

  Dlang mixins
     - instead of the C preprocessor and C++ templates, a mixin can be a function
       which generates code from strings. A template mixin is like a template
       pattern with substitution using the parameters to insert this in to the
       program. Instead of typename T, they use 'alias' as a generic type.
     - D mixins are not as general or as powerful as what can be done with the C
       preprocessor, but that may be deliberate so as they are more hygenic and
       are also not another language. There is more safety using mixins.
     - D template mixins are not as powerful as C++ templates, as there are
       limitations about type inference that C++ templates can use to generate
       code based on more nuanced context. D template mixins are good for simple
       cases.
  TypeScript mixins
     - similar to composition of interfaces with default implementations
     - side-steps the multiple-inheritance diamond problem
  TypeScript tsconfig.json
     - instead of Makefiles, a json file has compiler flags and list of sources.
     - perhaps defining how to invoke the compiler is part of the language?
  Vue/webpack/hot-reload
     - the hot-reload feature is great for fast prototyping
  TypeScript & SASS transcription
     - great idea to maintain compat, 'compiles' down to js/css
  Refactoring tools
     - being able for tools to parse the code is important and helps with better
       tooling.
     - eg: resharper
     - rename variables, functions etc.
     - reformat code according to a style guide
     - alphabetically order imports or other unordered lists
     - format tables to be aligned
     - consistent whitespace and other style
     - generate various code blocks automatically,
        - eg: switch statement for a given enum
        - a new function or module
  GraphQL
     - defined schema, query to this, result in preferred format
       extensible
  Atomontage
     - point clouds
  ECS
     - Entity component systems
  AOS/SOA
     - structs/arrays - packing, access - easily switchable without large code changes (Jai)
  Qt
     - UI framework and libraries
  Reflection/Object-tree/property pages
     - reflect to UI the state in memory
  Rendering backends
     - OpenGL/Vulkan/DirectX/Metal
  Scene-Graph
     - rendering driven by a logical scene-graph (in-memory might not be a graph/tree)
  SIMD optimizations
     - instruction level optimization
  Multi-threading
     - automatic thread / worker splitting / dynamic
  web
     - integrate with websockets and browsers/html etc
  security
     - security primatives supplied. Makes it easy to make things secure.

  Compile-time checks. Static strong typing.
  Static pointer/reference checking. Bounds checking.

Interface/Implementation:

  Binary only modules?
  Need to be able to have either:
     - module with source for interface and implementation
     - or source of the interface and binary of the implementation
  How to handle different platforms/cross-platform with binary distribution?
  Publish process probably should be able to sign the binary
  Publish ideally not to machine code, but to binary IR which can be
  compiled and optimized for target.
  Perhaps binary might be compiled C/C++ code, and the interface is just
  the equivelent Kai for the interface in C/C++. This means linkage
  with other languages is possible.


Mutation concept:

  In games, there are frames. Perhaps in a given frame there is the state.

  Then all the mutations are gathered for given input and given deltas and recalcs and are queued as
  a list of transactions to apply ready for the next frame.

  Then for the next frame the state is recalculated by appling all the mutations.

  I wonder if the concept could be applied to not just games.

  It means that can not make a state change request and then immediately get the new state. Must wait
  a frame for this, so not synchronous, but async as to when a change happens.


    class State
    {
        int GetValue() const               { return m_value; }
        State SetValue(int a_newValue)     { State s = this; s.m_value = a_newValue; return s; }

        int m_value;
    };

    class Frame
    {
        State m_state;
    };

    void frameLoop()
    {
      Frame current;
      Frame next;
      while (true)
      {
         // render using current

         // operate / update based on current
         next.m_state = current.m_state.SetValue(10);

         // swap
         current = next;
      }
    }

  Now imagine if make multi-threaded. Avoids race between readers and writers to current.
  However still race between multiple writers to next.


Nullable concept:

  Is nullable a requied concept? In a pointer based or references based language/system it
  can become required. But in a stack based system, there are no pointers so no concept of null.
  The object always exists. So pointer based objects can be not pointed to, which is null, or
  can point to an uninitialized or empty object, or can point to a valid object. A stack based
  system can not implicitly get a null for free, but is it desirable. Null pointers are one
  of the problems of pointer based systems. If fail to check a pointer and then dereference it,
  the program will likely crash. But then the code becomes littered with null checks decreasing
  readability. Both a real problems with having pointers. And it is not limited to pointers,
  the same applies to references which are pointers in disguise. It is only by convention that
  in C++ it is assumed a reference is a non-null pointer and an actual pointer is a nullable one.
  You can have a function which takes a reference, and with a pointer to an object, dereference it
  and pass it to this function without doing a null check and the reference will be a null one.
  It is by convention that you wouldn't do that and would null check the pointer before
  dereferencing it to ensure there are no null references, only nullable pointers.

  Is a null semantically just meaning empty or invalid? Sometimes a value could have sentinel
  value which means uninitialized or invalid or empty. Like a string which is empty or zero length.
  But how about an int. A value of 0 or -1 might be valid so can not be the sentinel value to mean
  empty or invalid. An extra bool needs to be attached to this to handle the concept. This might be
  quite a cost in space for small objects but may be preferable to having pointers to individual ints.
  With floats, there is a special sentinel value of NaN which could be used for this.
  For a bool, it would need to be one of three values, null, true or false. In JavaScript, these
  three are three different types as apposed to values. Javascript is not implicitly pointer based,
  but the values that are used are dynamically typed which means the type is carried around with
  the value. This allows coding in to the type a null state. The size of a null, true or false value
  will all be the same size which could be a byte.

  So there are two kinds of safety to consider. Run-time safety to prevent crashing, and semantic
  safety to ensure correctness of the meaning/interpretation of the data. Run-time safety can be
  ensured by not having nulls, instead relying on sentinel/default values which might be also valid
  values. This sacrifices semantic safety as it is ambiguous as to if the sentinel is valid or not.
  Semantic safety comes at the cost of runtime safety if it can not be guarenteed that all required
  null checking has been done.

  A language could perhaps give semantic safety and statically ensure runtime safety at compile
  time by static analysis of the dereferences to ensure they are after the required null checks.
  This might not need to be too costly if designed in to the language. This can be achieved with
  C++ and other languages with linting/analysis tools, but it is not a defined compile time error
  by the language, so it is an optional thing. 

  Rust is a good example of a language which does this.

  Optional - compiler checks that you have checked it is not None

  Borrow checker - can have one mutable reference and no const references, or can have one or many
  const references. Can then pass things using the restrict keyword to say they don't alias.

  In kai:

  Int32! X;   // nullable value

  if (X)
  {
    Int32 x = X;  // can now use it
  }

  translation:

  template <typename T>
  struct Nullable
  {
     bool m_isNull;
     T    m_value;
  };

  Nullable<Int32> _X;

  if (!_X.m_isNull)
  {
    Int32& X = _X.m_value;
    Int32 x = X.m_value;
  }


  Mutex - lock data, not code.
  Mutex is added to a struct with templates and need to lock to use the data
  The lock call on the data returns the accessible data with borrow-checker lifetime of the current scope.




Implementation:

  In Kai code:

    type Time     := Int32;  // strong typedef
    type Duration := Time;   // strong typedef, derivative type

    Int8 x;
    Time t;

    x := 1;
    t := 1;

  Translates to:

    Basic types:

      struct Int8  { int8_t   m_value = 0; }  // always initialized
      struct Int16 { int16_t  m_value = 0; }
      struct Int32 { int32_t  m_value = 0; }

    Derived types:

      struct Time     { int32_t  m_value = 0; }
      struct Duration { int32_t  m_value = 0; }  // The derivative type

    Consistent access is always with .m_value to get to the value.

      Int8 x;
      Time t;

      x.m_value = 1;
      t.m_value = 1;

  Rust lifetimes

       Foo a;
       let b = a;  // lifetime of 'a' ends here

     In C++:

       b = std::move(a);

     But in C++ it is up to the programmer to not use 'a' after this.
     It is similar to swap. If 'b' was originally uninitialized, it would be
     bad to use 'a' now.

     The goal is to avoid doing deep-copies to transfer values from one
     variable to another.

     Using references has the problem of ownership semantics. Which is the
     owning reference.

     Another solution is shared_pointers. The cost is loss of explict control
     of when the deallocation happens.


Syntax:

  '//'        - end of line comment
  '/*'...'*/' - multi-line comment

  Keywords:

    - module, interface, enum, type, struct, implementation, function, tests

  In C/C++, signed integer addition/subtraction have undefined behaviour

    int x, y;
    x = x + y;   // UB, compile can set fire to computer if it wants

    unsigned x, y;
    x = x + y;   // not-UB, this is fine

  Assumes 2's complement int representation
    - if a machine has an unusual representation, then users of
      that machine pay this price not everyone else with normal hardware
    - in many cases, sign vs unsigned are no different
    - can trivially cast to unsigned, perform the operation, and cast back
    - this avoids UB if one can assume 2's complement

  In C, code like this can be accidently written:
    if ( a = b )
  To prevent accidental assignment when wanting to compare:
    :=   assignment
    =    comparison
  Because functional style language, there are no ++ or += etc
  Assignment only happens to a new variable. Perhaps can infer the
  type.

  In C++ templates have been tacked on, templated functions are written like this:

    template <typename T>
    T func(T a, T b)
    {
      return a + b;
    }

  More natural would be:

    func(type T, T a, T b) => T
    {
      return a + b;
    }

  So in this way, a type is itself a type that can be passed around. This is C# like.

  Iteration, C#, python and C++11 range-based loops:

    for (x : arrayVector)
    for (x in array)
    for (x in range(0, 100))

  map/reduce/filter, co-routines ?

  how to make things like map/vector etc with functional programming?

  vector might be a pointer + length
  if add elements to it, then it might be the same pointer if there is space at the end
  the original vector is unaltered if it contains the original pointer + original length
  the new vector contains the same pointer but a new length
  same with removing elements at the end
  a linked list might be possible if can store some kind of edit history
  perhaps think of transactional DBs and journals. Perhaps keep a journal until
  no more references to the original.

  actually, really just create a new vector/container
  
  map,reduce,forEach,filter,zip,some/exists,every,find

  creating a new copy of a vector is fine if the size is not too large or the reduced size is small
  the internal implementation details can't construct a new vector and still be functional. I think
  the constraint is on the passed in parameters, not on the local variables, so a local variable can
  be mutated just fine. That is because it is on the stack and another thread shouldn't have a pointer
  or reference to it. It begs the question, how to have objects with functions which mutate them but
  restrict calling them. Perhaps the trick is that parameters to a function will always be passed by
  const reference, and therefore can never be modified. We can always return by value (RVO will make
  this cheap). Constructing the returned value can be done as the local variable will not be const
  and can then call anything on it which mutates it (or constructs it).

  sized types only (for portability and explicitness - but limits targetting really low-end devices)
     null, bool, int8, int16, int32, int64, flt32, flt64

  perhaps profiles? Can mark a module as not using floats or anything 64bit?

  string is a first class type
  string_view is not needed, strings contain the length so can construct a string from another string
  strings are utf8
  string formats are translatable

  because functional, only operators which make a new output,
       eg:   a = b + c
       not:  a += b         // this mutates a
  so operator overload is limited to a few less things
       +    (T, T) -> T
       -    (T, T) -> T
       *    (T, T) -> T
       /    (T, T) -> T
       %    (T, T) -> T

  null as a type and a value?
  void?
  pointers - ?
    if functional, are pointers a thing?
  Some concepts from rust could be 'borrowed'
    borrowed pointers if need pointers
    crates
    unsafe

  static_asserts supported
  inline C code perhaps - so can mark as unsafe?

  keywords:
       module, interface, enum, type, struct, implementation, function, tests
  alpha:
       [a-zA-Z]
  num:
       [0-9]
  alnum:
       alpha|num
  identifier:
       alpha alnum*
  number:
       num [0-9a-fA-F]* [h

 EBNF

 Starting from a Pascal example:

  Lexical Grammar
       <letter>          ::= [a-ZA-Z]
       <digit>           ::= [0-9]
       <id>              ::= <letter> { <letter> | <digit> | "_" }
       <digits>          ::= <digit> { <digit> }
       <integer literal> ::= <digits>
       <string literal>  ::= "\"" { < a char or escape char > } "\""
       <real literal>    ::= <digits> "." <digits> [ "e" [ <sign> ] <digits>] <string literal> ::= "\"" { < a char or escape char > } "\""
       <literal>         ::= <integer literal> | <real literal> | <string literal>
       <operators>       ::= "+" | "-" | "*" | "%" | "=" | "!=" | "<" | ">" | "<=" | ">=" | "(" | ")" | "[" | "]" | ":=" | "." | "," | ";" | ":" |
                             "!" | "||" | "|" | "&" | "&&" | "^" | "{" | "}"
       <keywords>        ::= "if" | "then" | "else" | "of" | "while" | "do" | "begin" | "end" | "var" | "array" | "procedure" | "function" | "program" | "assert"
       <predefined id>   ::= "Boolean" | "false" | "integer" | "read" | "real" | "size" | "string" | "true" | "writeln"

Context-Free Grammar

<program> ::= "program" <id> ";" <block> "."
<declaration> ::= "var" <id> { , <id> } ":" <type> |
"procedure" <id> "(" parameters ")" ";" <block> |
"function" <id> "(" parameters ")" ":" <type> ";" <block> <parameters> ::= [ "var" ] <id> ":" <type> { "," [ "var" ] <id> ":" <type> } | <empty> <type> ::= <simple type> | <array type>
<array type> ::= "array" "[" [<integer expr>] "]" "of" <simple type>
<simple type> ::= <type id>
<block> ::= "begin" <statement> { ";" <statement> } [ ";" ] "end"
<statement> ::= <simple statement> | <structured statement> | <declaration>
<empty> ::=
<simple statement> ::= <assignment statement> | <call> | <return statement> |
< read statement> | <write statement> | <assert statement>
<assignment statement> ::= <variable> ":=" <expr> <call> ::= <id> "(" <arguments> ")"
<arguments> ::= expr { "," expr } | <empty> <return statement> ::= "return" [ expr ]
<read statement> ::= "read" "(" <variable> { "," <variable> } ")" <write statement> ::= "writeln" "(" <arguments> ")"
<assert statement> ::= "assert" "(" <Boolean expr> ")"
<structured statement> ::= <block> | <if statement> | <while statement> <if statement> ::= "if" <Boolean expr> "then" <statement> |
"if" <Boolean expr> "then" <statement> "else" <statement> <while statement> ::= "while" <Boolean expr> "do" <statement>
      
<expr> ::= <simple expr> |
<simple expr> <relational operator> <simple expr>
<simple expr> ::= [ <sign> ] <term> { <adding operator> <term> }
<term> ::= <factor> { <multiplying operator> <factor> }
<factor> ::= <call> | <variable> | <literal> | "(" <expr> ")" | "not" <factor> | < factor> "." "size" <variable> ::= <variable id> [ "[" <integer expr> "]" ]
<relational operator> ::= "=" | "<>" | "<" | "<=" | ">=" | ">" <sign> ::= "+" | "-"
<adding operator> ::= "+" | "-" | "or"
<multiplying operator> ::= "*" | "/" | "%" | "and"


Discussion:

 UI should be an included part of the language / libraries.
 Too bad C/C++ do not have a solution for this. They miss out
 on learning about real problems programs need to solve to
 be useful. For example i18n. C/C++ does a terrible job about
 this, but despite attempting to be general, they are simply
 not. This is because they don't even attempt to tackle UI
 so it isn't even evident that they fail woefully at i18n.
 For example uint8_t is an unsigned char, so stream operators
 acting on it treat it like a char instead of an integer
 value. There are subtle differences like that that make it
 hard to do utf right. It can be done, but you have to fight
 against C++ to do it. It shows horrible bias, leaning towards
 anglo european users and is not inclusive of large portions
 of the world that do not used latin based langauges. The
 defacto standard for cross-platform UI is to use Qt which
 helps solve these i18n issues. It mostly ignores C++ and
 creates its own QString class. This causes fractures in
 the whole C/C++ egosystem with code that is using Qt and
 code which is not. Similar fractures occur with Boost.
 Some developers avoid it like the plague and unwilling to
 allow it to infect their code base, whilst others embrace
 it. Again causing another division. As C++ has been evolving
 Boost has proven to be a good testing ground for new C++
 features and some of those have been woven in to C++.
 This evolution does cause some ugly side-effects with
 code which used Boost and migrates to the standardized
 C++ versions of these features supporting both variations
 with conditional compilation. This can be as a result of
 on different platforms the adoption of those features in
 the compilers happening at a different pace. Ideally a
 more formal feature testing/preview system would be better
 like the TR technical review, but on a larger community
 wide contribution scale.

 Governance - community consultation. RFCs. Preview
 implementations. Test-bed servers, unit tests, regression
 test suites. A reference implementation, not just a
 standard which is a document. Aim for zero undefined
 behavior. Everything is defined or it is a compile error.
 Take a modern approach of targetting modern machines,
 not hyperthetical machines or esoteric machines. The
 code has to work on real machines to solve real problems
 and now. Adding two ints in C++ can have undefined
 behaviour if they overflow and the compiler is free to
 do what it likes in this case. That is a crazy thing
 as a programmer to accept, particulary given that
 nearly any machine being targetted today or even in the
 last several decades uses 2's complement and so is
 pretty well defined at the machine level.

 Bootstraping - how much of the libraries should be in
 the language. Some part will be Core and the rest the
 wider language / libraries.