
Copyright 1998 Lars T Hansen.                   -*- text -*-

$Id: BUGS 2917 2006-04-27 21:25:06Z tov $


			 KNOWN BUGS IN LARCENY


These are bugs that are or may be visible to the Scheme programmer.
Outright compiler bugs are sometimes only known to Will.

Bugs are listed in the order recorded.  Fixed bugs have been moved to
the file BUGS-FIXED.  Each entry consists of an entry number, the
version of Larceny in which the bug was discovered, the date it was
recorded and who found it, a description of the problem, and any
supporting documentation, information, test cases, and so on.

Priority high: "Must be fixed ASAP"
Priority medium: "Should be fixed ASAP"
Priority low: "Fix whenever."


HIGH PRIORITY BUGS: none.


005  (v0.27e)
     Priority: low.
     Category: RTS / correctness

     While the system can now grow the heap to any size (by growing the
     descriptor tables), an attempt to start the system with more than 32MB
     or so of initial heap will result in a segmentation fault.

     The reason is that the size of the descriptor tables during startup
     is limited to 32 MB.  A larger size will result in an attempt to grow
     them, which results in a barrier function being called to install the
     new tables.  But at this point the barrier has not yet been initialized,
     so garbage is accessed.

     This is generally no longer a problem, since heaps are allocated
     lazily.  It will become a problem when someone tries to load a heap
     image of about 32MB or more.

034  v0.28f (970723 / lth)
     Priority: very low.
     Category: SPARCASM / correctness

     The number of arguments in a procedure call expression is limited to 
     1023.  This is due to the way the argument count is set up during 
     INVOKE and the way it is compared during ARGS=, ARGS>=.

     Note that expressions like (apply + massively-long-list) still work,
     because + takes 0 or more arguments (so the comparison is with 0), and
     apply computes the correct length of the list.  It is only a syntactic
     call expression that is so limited.

     This is easy to fix, and the fix does not impact code that has no
     more than 1023 arguments in a call.

     The compiler does in fact emit an 'invoke n' instruction for even 
     large n, so this is a real bug.

036  v0.28f (970826 / lth -- old bug)
053  v0.32 (971107 / lth)
     Priority: low.
     Category: LIB / quality-of-implementation

     Reader bugs.

   * The reader accepts \ as an initial character in symbols, but not
     as a subsequent, and when it is an initial character, it is not
     an escape character.

     I think the right behavior for \ is to work as an escape character
     in symbols and be valid everywhere in a symbol; \c should then mean
     literally c everywhere (notably preserving case, but also allowing
     spaces, control characters, and non-standard characters).  In 
     particular, if a symbol starts with \c, then even if symbols may not
     usually start with c, the interpretation should be a symbol.

   * Another reader problem:

	(symbol->string '|#x) => ""
	(symbol->string '|#x|) => "|#x"

     On the whole, the dark corners of the input syntax are indeed dark.

042 v0.31 (971024 / lth)
    Priority: low
    Category: LIB / correctness

    > (+ 'a)
    a
    > (* 'a)
    a

    [Thanks to Galen, who did not report this but who told me of this
     bug in the free version of Allegro Common Lisp for Windows(!).]

056 v0.32 (980126 / lth); v0.40 (981217 / lth)
    Priority: low
    Category: BUILD / correctness

    The development environment (both Chez Scheme and Larceny hosts):

	; Just finished make-larceny-heap
	> (inline-barrier)

	*** Error encountered!
	Deleting target file: larceny.heap

	Error: variable inline-barrier is not bound.
	Type (debug) to enter the debugger.

    In this case, it should not delete the previous target!  This did not
    happen on a subsequent try.  

    Notes:
    Looks like an error handler is not being taken down.

    In fact, it only appears to happen after a ^C has interrupted 
    a previous command; could it be that the interrupt handler does not
    interact properly with the error system?

057 v0.32 (980403 / lth)
087 v0.39 (981216 / lth)
    Priority: low
    Category: LIB / correctness

    Memstats and the static area:

    * Memstats does not return information that distinguishes the static
      area, if present.  Instead it returns information about Generation #n.
      So the info is there, it's just not being distinguished.

    Display-memstats:

    * Does not distinguish the static area in its printout (can easily use
      system-information to do this).

    * Displays information about the non-predictive remembered set even
      though it is not in use.

    (See also fixed bug 86, which solved part of the problem: gathering
     the statistics in the RTS.)

062 v0.33 (980430 / wdc)
    Priority: low
    Category: LIB / performance

    Floating point output is very, very slow.  The Dragon4 algorithm, which
    uses (slow) bignums, should be replaced by Dybvig's algorithm.

067 v0.34 (980603 / lth)
    Priority: low
    Category: LIB / correctness

    This is wrong:

	> (char-upcase #\)
	#\
	> (char-downcase #\)
	#\

    because these letters are not upper/lower case variants of each other. 
    This example presents a slight problem for existing character 
    processing code if an alphabetical is both upper and lower case,
    but character processing code is character set dependent anyway,
    so it must be fixed.

    [Must fix this both in the reader and in Lib/string.sch.]

069 v0.34 (980611 / lth)
    Priority: low
    Category: LIB / correctness

    Writer: this is weird but mostly benign because "read" discards the
    backslashes properly:

	> (write "a;b;c")
	"a\;b\;c"

    It's not benign when code parsing text created with write does not
    expect the backslashes; it should probably be fixed.  (On the other 
    hand it's no worse than quoting symbols with vertical bars.)

071 v0.34 (981016 / wdc)
    Priority: low
    Category: BUILD / quality-of-implementation

    DISASSEMBLE-FILE does not work on .fasl files.

    Notes:
    The problem here is that FASL files do not have sufficient info
    for disassembly -- the tags present in LOP files are missing.  This
    will be fixed when we go to a FASL format that allows separate 
    loading and linking, because that format will retain the tags.

074 v0.36 (981207 / dougo)
139 v0.46 (991023 / lth)
    Priority: medium.
    Category: Pretty printer / correctness

    (1) Records (defined with make-record-type et al in Auxlib/record.sch)
        are no longer displayed with the record type printer; it just displays
        "#<STRUCTURE>".

        The pretty printer does not use the structure printer to print 
        structures.   Fixing this will probably require changing the semantics
        of the structure printer, which now has no notion of "layout".

    (2) The pretty printer does not format very well; in particular, it formats
        things vertically that could be formatted horizontally.  Try:

          (pretty-print ((system-function 'sys$get-resource-usage)))

        for some pretty-line-length values; when I tried I could not get it to
        print the remset entries horizontally even with line length 150 when 
        all the lines fit on 100 characters.

077 v0.36 (981207 / lth)
    Priority: low.
    Category: LIB / correctness

    Jaffer's test suite:

	(#<PROCEDURE string->number> "3i")  ==> 0+3i
	 BUT EXPECTED #f
	(#<PROCEDURE string->number> "3I")  ==> 0+3i
	 BUT EXPECTED #f
	(#<PROCEDURE string->number> "33i")  ==> 0+33i
	 BUT EXPECTED #f
	(#<PROCEDURE string->number> "33I")  ==> 0+33i
	 BUT EXPECTED #f
	(#<PROCEDURE string->number> "3.3i")  ==> 0.0+3.3i
	 BUT EXPECTED #f
	(#<PROCEDURE string->number> "3.3I")  ==> 0.0+3.3i
	 BUT EXPECTED #f

    The problem is that the number syntax in the Report requires an 
    explicit sign if a complex has no real part.  (There are plenty of
    similar "bugs" in the number parser, see comments in the code.)

083 v0.38 (981215 / lth)
    Priority: medium.
    Category: TWOBIT / correctness
    See also: 091

    Bugs w.r.t. R4RS, R5RS, or the IEEE standard:

    * some top-level names are reserved by the macro expander:
      lambda, define, quote, begin, if, set!

    * some top-level names are reserved by the quasiquote macro:
      lambda, define, quote, begin, if, set!


084 v0.39 (981216 / lth)
    Priority: low
    Category: INTERPRETER / correctness
    See also: 085, 099, 153

    Minor interpreter bug: it gives bad error messages when using
    primitive names in environments that don't have those primitives.

	> (eval '(let ((a 1)) (foo a)) (null-environment 4))
	Error: Reference to undefined global variable `foo'.
	> (eval '(let ((a 1)) (car a)) (null-environment 4))
	Error: Attempt to apply #!undefined, which is not a procedure.

    The problem is that the interpreter has a fixed notion about what
    constitutes a primitive.  Technically, there should be a primitive 
    table that is relative to the environment.


088 v0.40 (981219 / lth)
    Priority: medium.
    Category: LIB / performance

    When running with (integrate-usual-procedures #f), arithmetic is quite
    slow, because the definitions of '+' etc are coded in Scheme using
    rest args.  Much better would be if they were written in MAL in 
    the same way as read-char / write-char, special-casing 0, 1, 2, 3, and
    4 arguments, or, even better, using case-lambda.

    On vega:

	> (default-code)
	#t
	> (define (fib n) (if (< n 2) n (+ (fib (- n 1)) (fib (- n 2)))))
	fib
	> (run-with-stats (lambda () (fib 30)))
	Words allocated: 10770618
	Words reclaimed: 10738790
	Elapsed time...: 5863 ms (User: 5840 ms; System: 0 ms)
	Elapsed GC time: 174 ms (in 41 collections.)
	832040

    In fact, the interpreter is only 50% slower:

	vega(197) % larceny -small
	Larceny v0.40 (precise:SunOS5:split) (lth 18-Dec-98 17:59:12)

	> (define (fib n) (if (< n 2) n (+ (fib (- n 1)) (fib (- n 2)))))
	fib
	> (run-with-stats (lambda () (fib 30)))
	Words allocated: 10770624
	Words reclaimed: 10733898
	Elapsed time...: 8376 ms (User: 8200 ms; System: 20 ms)
	Elapsed GC time: 85 ms (in 41 collections.)
	832040

    This would be better:

	(define +
	  (case-lambda
	   (() 0)
	   ((x) x)
	   ((x y) (+ x y))
	   ((x y z) (+ (+ x y) z))
	   ((w x y z) (+ (+ (+ w x) y) z))
	   ((x . rest) (let loop ((x x) (rest rest))
	                 (if (null? rest)
	                     x
	                     (loop (+ x (car rest)) (cdr rest)))))))

    By hacking together MAL code that does the above for + and -,
    we find that:

	> (fib-benchmark)   ; That's fib(30)

	--------------------------------------------------------
	Standard fib
	Words allocated: 474
	Words reclaimed: 0
	Elapsed time...: 2533 ms (User: 2520 ms; System: 0 ms)
	Elapsed GC time: 0 ms (in 0 collections.)

    (`<' of 2 arguments does not cons anything, nor does it go out-of-line,
     because it takes 2 or more arguments and the no-extra-arguments case
     is in-lined and only stores () in a register.)

    That isn't as good as Chez Scheme -- 1660 ms -- but it's a lot 
    better than it was!  

    The time improves to 2300 ms with slightly smarter code, and to 2150 ms
    if we can avoid the stack frame creation, which may be possible
    with fundamental MAL support (so we don't have to free up a register
    temporarily).

    We can do better by improving the speed of global procedure calls: 
    using a function cell, caching global cells.   These in themselves do 
    not appear to be enough, though --  experiments with caching and 
    unsafe-code does not reduce the time much from the above (only by 
    about 100ms).  

    Fact: the following (illegal) code for + and - results in a run time
    of 2300 ms:

	(define + (lambda (a b . rest) (if (null? rest) (+ a b) ...)))
	(define - (lambda (a b . rest) (if (null? rest) (- a b) ...)))

    Possible MAL support: 

      (args-switch ((2 1002) (0 1000) (1 1001) (3 1003) (* 1004)))

    which says go to 1002 if 2 args, 1000 if 0, ..., and to 1004 if none of
    the above.  Does not destroy RESULT.  Ordering can be used as an 
    optimization hint but has no semantic meaning.  It would be reasonable 
    to limit the argument count in the non-* case to some fairly small number.

    [Numbers to be taken with a grain of salt.]

    A caching strategy for 'known' globals, like that used by the interpreter,
    helps.  Of course, it bloats the code somewhat, but it appears that the 
    code size increase is linear if one is careful about lifting common 
    expressions.  Chez Scheme gets good performance apparently w/o doing this,
    because code that uses 'plus' for '+' runs as fast as code that uses '+',
    at optimize-level 1.

    (See file Lib/Common/arith.mal for sample implementation when fixing
     this.)

089 v0.40 (981219 / lth)
    Priority: medium.
    Category: RTS / quality-of-implementation

    Should catch SIGSEGV, SIGBUS because broken unsafe code can otherwise 
    cause a core dump.

    Notes:
    This is partly implemented in Rts/Sys/signals.c but not complete; a 
    minor redesign of the exception handling system seems in order.

095 v1.0a1 (981221 / lth)
    Priority: low
    Category: BUILD / correctness

    `make realclean' removes files that were in the distribution archive.
    That's not a real problem, because the files it removes are only there
    to allow an executable to be built without a host system; once it's 
    built, Larceny can be used as a host system.  Still, this is somewhat
    counter-intuitive.

100 v1.0a1 (990104 / lth)
    Priority: low.
    Category: TWOBIT / quality-of-implementation

    The routines m-warn and m-error used by the macro expander should
    signal their conditions using machinery that ensures that output 
    ports are still open (in the same way the REPL does this).  Perhaps
    this should wait until we have some exception machinery.

101 v1.0a1 (990104 / lth)
    Priority: low.
    Category: TWOBIT / quality-of-implementation

    From: will
    >A larger problem is that warnings and error messages ought to be
    >expressed in terms of the original source code, or as original as
    >macro-expanded source can be.  Thus the (.begin|18|21) form should
    >have been reported as (begin).  I doubt whether this will be fixed
    >anytime soon, though.

102 v1.0a1 (990104 / lth)
    Priority: low
    Category: LIB / quality-of-implementation

    Loader bug.

    The loader (really: the reader) produces a strange error when
    confronted with a file that is neither a Scheme source file nor a 
    FASL file, e.g. when given a heap image as in the popular error

	larceny sparc.heap

    (which tells larceny to load sparc.heap as a Scheme file).

    To fix this, the loader should set up an exception handler and
    catch (some) errors signalled by READ.

106 v1.0a1 (990108 / lth)
    Priority: medium.
    Category: LIB / quality-of-implementation

    This is silly:

	> (procedure-name max)
	.max|3

    I don't know whether the compiler or the assembler should be responsible
    for cleaning this up; either way, the name needs to be demangled.

112 v0.34 (981016 / wdc)
    Priority: medium
    Category: TWOBIT / quality-of-implementation

    Space leaks can result from stale registers and stack slots.

113 v0.34 (981016 / wdc)
    Priority: medium
    Category: TWOBIT / quality-of-implementation

    Space leaks can result from dead variables in registers or in
    the stack.

114 v0.34 (981016 / wdc)
    Priority: low
    Category: TWOBIT / quality-of-implementation

    Space leaks can result from dead variables in closures.

125 v1.0a1 (990429 / lth)
    Priority: low
    Category: FFI / correctness

    Sometimes Larceny dumps core on exit after the FFI has been used to
    load a file and link to functions in it repeatedly (as when errors
    have occurred).  Here's a backtrace:

	(gdb) where
	#0  0xef660cac in __do_global_dtors_aux ()
	#1  0xef660c84 in _fini ()
	#2  0xef7ca07c in ?? ()
	#3  0xef698d28 in _exithandle ()
	#4  0xef7080ac in exit ()
	#5  0x1a400 in UNIX_exit (code=0) at Sys/unix.c:279
	#6  0x19f7c in larceny_syscall (nargs=1, nproc=7, args=0x3d2a4)
	    at Sys/syscall.c:80
	#7  0x1b150 in C_syscall () at Sparc/cglue.c:201
	#8  0x1b380 in callout_to_C ()
	#9  0x3d298 in globals ()
	#10 0x15768 in main (argc=2, argv=0xeffff104) at Sys/larceny.c:159

    This could have happened because the shared library was recompiled
    and the operating system pages in the library directly from the
    file.

126 v0.43 (990629 / lth)
    Priority: medium
    Category: LIB / correctness

    The enable-interrupts primitive does double duty: it enables 
    interrupts, and it sets the timer value.  At the time I thought 
    this was very clever...  However, we get into problems in 
    call-without-interrupts:

	(define (call-without-interrupts thunk)
	  (let ((old #f))
	    (dynamic-wind 
	     (lambda () (set! old (disable-interrupts)))
	     thunk
	     (lambda () (if old (enable-interrupts old))))))
    
    As you can see, any time spent in the critical region is not deducted
    from the budget of the thread.  This seems fairly innocent.  However,
    consider this straightforward implementation:

	(define (block t)
	  (if (not *tasking-on*) (error "Tasking is not on."))
	  (if (not (task? t)) (error "BLOCK: " t " is not a task."))
	  (let ((critical? (tasks/in-critical-section?)))
	    (call-without-interrupts
	      (lambda ()
	        (cond ((tasks/runnable? t)
		       (run-queue.remove! *run-queue* t))
	              ((eq? t (tasks/current-task))
	               (tasks/switch #f critical?)))))))

    The task will be blocked, but when it is restarted by the scheduler
    with the full quantum, the call to call-without-interrupts will
    re-enable interrupts with the saved value (which can be anything).

    It's possible to code around that problem, but we shouldn't have to.
    So enable-interrupts should be split into two primitives:
    enable-interrupts, which takes no arguments, and timer-set!, which
    takes a new timer value.  disable-interrupts should return a boolean,
    and not #f or the current timer value.  If it turns out that we
    need to be able to read the timer, we can introduce a timer-get
    procedure later.

133 v0.43 (990903 / lth)
    Priority: low
    Category: TWOBIT, probably / IEEE FP std compliance

	> (+ 0.0 -0.0)
	-0.0
	> (+ -0.0 0.0)
	0.0
	> (define (x a) (+ a -0.0))
	x
	> (x 0.0)
	0.0

    This came up on comp.lang.functional.  David McClain clamed that
    (+ 0.0 -0.0) => 0.0 and my *inference* is that the IEEE std mandates
    this, in which case Twobit improperly constant-folds the first
    expression above.

    In any case the compiler and interpreter give different results for
    the first expression.

    It appears there's no way in larceny at present to distinguish -0.0
    from 0.0 without looking at the actual bits.

138 v0.45 (991015 / lth)
    Priority: low
    Category: RTS / correctness

    Description.

    I have discovered a problem in the GC write barrier in the NP collector.
    In the macro scan_core_partial() a pointer is added to the SSB of the NP
    extra set by another macro remember_vec(), the call to that macro is:

    (line 396)
        if (must_add_to_extra) remember_vec( tagptr( T_objp, VEC_TAG ), e );

    However the case being handled in scan_core_partial() is that for both
    procedures and vectors, thus we see that the pointer added to the SSB
    and remembered set may have the wrong tag (a vector tag for a procedure
    pointer).

    Impact.

    The problem only occurs when a procedure structure promoted to the GC
    young area contains a pointer to the GC old area: in this case only will
    an incorrect pointer be added to the SSB for the NP extra set.  Normally
    procedures are created on-the-fly by loading compiled code, by creating
    closures, by the interpreter (which actually creates them at a very high
    rate, which is why some scrupulous debugging code found the problem),
    and by the debugger.

    The GC currently treats vectors and procedures alike by stripping the
    tag and then handling the raw pointer, so I do not expect GC crashes due
    to mis-tagging.

    While the remembered set hashes the pointer by stripping the tag,
    comparisons in the hash table are on the full pointer, so there could
    plausibly be duplicate entries in the hash table.  Duplicates are
    unlikely however since promotion creates only one pointer and the other
    would have to be created by assignment, and procedures are usually
    side-effected only for initialization and those assignments should
    almost never trap in the write barrier.  If duplicates do occur, then an
    object may be scanned twice.  As far as I know, that does not create any
    problems for the collector.

    In summary it is extremely unlikely that the bug causes performance
    degradation, and as far as I can tell, it is not possible for it to
    cause GC crashes.

    Action.

    The bug should be fixed.  Fixing it will slow down the case when a
    pointer needs to be added to the SSB, probably not an issue in practice.

140 v0.47 (991117 / lth)
    Priority: low
    Category: LIB / correctness

    The console I/O code (Lib/Common/conio.sch) leaks file descriptors,
    slowly.  See comments in that file.  (The nonblocking I/O console
    code in Experimental/nonblocking-console.sch also has this problem.)

141 v0.47 (991121 / lth)
    Priority: low
    Category: ASSEMBLER / correctness

    The error codes reported for make-vector and make-procedure are wrong:

	> (make-vector -10)
	Error: make-vector-like: -10 is not an exact nonnegative integer.

	> (make-procedure -10)
	Error: make-vector-like: -10 is not an exact nonnegative integer.

    See Asm/Sparc/sparcprim-3b.sch: emit-make-vector-like!.

142 v0.47 (991121 / lth)
    Priority: low
    Category: ASSEMBLER / correctness

    There is no check that the value stored into a bytevector actually
    fits in a byte:

	> (let ((x (make-bytevector 1))) 
	    (bytevector-set! x 0 377) 
	    (bytevector-ref x 0))
	121

    (This may be a feature actually.)

143 v0.47 (991121 / lth)
    Priority: low
    Category: RTS / error handling

    Currently, the system will signal an error appropriately if the
    argument to make-vector, make-bytevector, or make-procedure is negative
    or if it is a bignum but not if it is a nonnegative fixnum larger
    than the max object size -- in that case, the system will panic.

    The current behavior is inconsistent and unreasonably brutal.  It 
    would be nice if some reasonable error could be signalled like in
    the other cases.

144 v0.47 (991121 / lth)
    Priority: low
    Category: RTS / error handling

    The number reported is obviously wrong but somewhat benign, given what
    the system does next:

	> (mkv (- (expt 2 29) 1))
	Larceny Panic: Can't allocate an object of size -2147483648 bytes: \
	\max is 16777215 bytes.

145 v0.47 (991122 / lth)
    Priority: low
    Category: Solaris FFI / usefulness

    Currently the FFI uses the RTLD_LOCAL flag to dlopen(); this seems
    most reasonable.  However, some libraries mysteriously require
    being linked with RTLD_GLOBAL, so far libjava.so has this problem.

    The "bug" is probably that the FFI only has the ability to use one
    or the other when loading a library; we could extend the FFI to
    accept configuration flags to work around issues like these.

146 v0.47 (991122 / lth)
    Priority: medium
    Category: TWOBIT / quality-of-implementation

    The macro expander needs to be tuned.
    The rest of Twobit needs to be tuned.

148 v0.48 (991123 / lth)
    Priority: medium
    Category: LIB / correctness

    The I/O system is not thread-safe.

149 v0.48 (991123 / lth)
    Priority: low
    Category: Language / consistency

    LOGAND, LOGIOR, LOGNOT, and LOGXOR should probably be called
    something like FXAND, FXIOR, FXNOT, and FXXOR for consistency with
    other fixnum operators' naming.  Ditto LSH, RSHA, RSHL.

150 v0.48 (991124 / lth)
    Priority: low
    Category: Language / usefulness

    There is no way to seed the random number generator, and it doesn't
    auto-seed on startup, so we get the same sequence every execution
    of the system.

151 v0.48 (2000-01-11 / lth)
    Priority: medium
    Category: TWOBIT / interface correctness

    Twobit generates the primop $const/setreg, which is not in the core
    instruction set.

166 v0.53 (2004-07-11 / jivera@flame.org)
    Priority: medium
    Category: TWOBIT / correctness

    From: William D Clinger <will@ccs.neu.edu>
    To: jivera@flame.org, larceny@ccs.neu.edu, will@ccs.neu.edu
    Subject: Re: Are Twobit's pass 2 rewrite rules valid?
    Date: Sun, 11 Jul 2004 22:04:09 -0400

    Yes, this is a bug.  I was aware of it, and it should have been
    in the list of known bugs.  I'll make sure it gets there.

    Will

    Dr. Clinger,

    I have been thinking about some of the transformations that Twobit
    performs in phase 2 [1], this one in particular:

      (E0 ... (begin --- E) ...)      (begin --- (E0 ... E ...))

    Wouldn't this rewrite prove problematic in the following example?

      (foo (begin (set! x #f) x)
           (begin (set! x #t) x))

    Following the above rewrite rule, this would be rewritten (along with
    nested begin collapsing) to:

      (begin (set! x #f)
             (set! x #t)
             (foo x x))

    The following snippet seems to confirm that fear:

      > (pass2 '((begin f)
                 (begin (set! x (begin #t)) (begin x))
                 (begin (set! x (begin #f)) (begin x))))
      (begin
        (set! x (begin #t))
        (set! x (begin #f))
        ((begin f) (begin x) (begin x)))

    Perhaps this is a known bug (it's not in Docs/BUGS of 0.52).

    [1] http://www.ccs.neu.edu/home/will/Twobit/p2local.html


167 v0.53  (2004-12-10 / lth)
    Priority: low
    Category: REPRESENTATIONS / Portability

    The data representations in Petit Larceny are quite abstract and
    precise layouts are defined by macros in C header files.  In most
    cases there's not much that can be done with this abstraction, but
    in the case of Unicode there is: one can go from a CHAR data
    representation that has 8 bits of data to one that has 24 bits of
    data, provided the character data use all available space.  But
    the char representation puts the character data in the middle 8
    bits.  For all C code this is abstracted away and it is easy to
    change the charcode() and int_to_char() macros.  But the heap dumper
    does not cooperate: it knows what a character looks like.

    The heap dumper should be parameterized to handle this, or
    perhaps, we should store the character data more conveniently for
    a Unicode port to get at them.
 

168 v0.53 (2005-05-29 / pnkfelix)
    Prioirty: medium
    Category: RTS / correctness
        
    Larceny segfaults when you attempt to dump the heap with very large strings.
    E.g.:
    (begin (define largestring (make-string 4093 #\x)) 
           (dump-interactive-heap "/tmp/dump.heap"))
    will segfault.  Length 4093 is a borderline case; a string of
    length 4092 works fine.
 
    I plan to work around this in the short term by not having any
    strings of that length in the source code (requires a fix to
    make-templates.sch).  But I should try to track down why this is
    happening (is it a heap-dumping bug, or a broken heap invariant?).

    NOTE: this bug may have been fixed with a change by Lars to
    Rts/Sys/heapio.c on Dec 13 2004; our repositories got out of sync.

NEXT ENTRY: 169

; eof
