15-121: Java Prg


Assignment 4: The 3x+1 Problem

Due: Monday 12:01 AM, September 16, 2001


Description

The so-called Collatz function is a deceptively simple function defined on the positive integers: It is straightforward to write a Java class that implements coll. A single application of coll is not particularly interesting, but when one applies the function repeatedly something strange happens: after a while one always reaches 1 (the fixed point). For example, starting at 18 we get
18, 9, 14, 7, 11, 17, 26, 13, 20, 10, 5, 8, 4, 2, 1
Likewise, starting at 100 we get
100, 50, 25, 38, 19, 29, 44, 22, 11, 17, 26, 13, 20, 10, 5, 8, 4, 2, 1
Of course, these are just two examples, but one can try out many more starting values, always with the same result: ultimately one reaches 1. This leads to the following conjecture.

Conjecture: Starting at any positive integer, repeated application of the Collatz function will ultimately produce 1.

The problem remains open, despite considerable efforts by a large number of people, and endless amounts of CPU time dedicated to it. The question is annoyingly simple to state, which makes it so particularly attractive -- somewhat like Fermat's last theorem, though that was recently solved. Any implementation of the Collatz 3x+1 function in Java using ordinary ints is really too feeble for interesting experiments: we need to be able to cope with large numbers. As it turns out, numbers of the form x = 2^k - 1 are particularly interesting. Your code should cope nicely with values of k around 5000.

For this assignment you will write a class Collatz that uses a dynamic array of boolean to represent numbers in binary. The class is used to determine the number of steps needed to reach 1, the so-called stopping time, starting at numbers of the special form x = 2^k - 1.

If one prints the binary digits as white/blue pixels the evolution of any such number looks typically like so. Each column of pixels represents one number in the computation. Note the solid triangle on the left, can you explain where it comes from?

For all values of x up to 500 one can see the stopping times in the following picture.
This is somewhat symptomatic of the Collatz problem: there clearly is some structure, stopping times are not random. But it is extremely difficult to say exactly what this structure is. The following picture shows the values encountered for starting value x = 2^33 - 1, plotted the standard way.
Note how the numbers get larger for a while before they shrink, and finally lead to 1. Also note the scale, the tail on the right only appears flat.

More information on the Collatz problem can be found on the web. Google for "Collatz", or take a loog at


Files

The project is organized into the following two files: You can get the file RunCollatz.java from the Assignment 4 directory on Blackboard.

The first implements the Collatz class, and the second is a simple driver for that class. An outline of the driver will be given to you, do not make any changes that would affect the input/output behavior of the program; just fill in the part of the main loop that concerns your implementation of the Collatz class. It is probably a good idea to have the following public methods in the class, but you can structure your class any which way you like as long as it works in the driver.

      // constructor
      Collatz( int kk ) {...}

      // move to x/2 or (3x+1)/2
      // output  false  when fixed point 1 is reached
      // and     true  otherwise (keep running)
      public boolean  next() {...}
      
      // print run data
      public void  print() {...}

      // print current digit pattern
      public void  show() { ... }

Clearly, you do not want to allocate an array of k bits when dealing with starting value 2^k - 1. How much larger an array you need is best determined by a little experimentation.


Specification

Your program will not produce any graphics, just plain ASCII data. In plain mode, the output looks like this:
>java RunCollatz 1000 1010
1000   7841   4316   3525
1001   8804   4923   3881
1002   8805   4923   3882
1003   8806   4923   3883
1004   8807   4923   3884
1005   8808   4923   3885
1006   8809   4923   3886
1007   9127   5123   4004
1008   9128   5123   4005
1009   9129   5123   4006
1010   9130   5123   4007
The first column shows the value of k (as in x = 2^k - 1), the second the total number of steps till 1 is reached, and the next two a more detailed listing of the number of up-steps and down-steps. Note the strange pattern: up-counts stay constant for a while, but the down-count increases steadily in each such block. The strange thing is that the Collatz function remembers the last time the same up-count appeared, and the "fixes" the down-count correspondingly.

There is also an option -s which produces a primitive picture of the numbers encountered, written out in binary.

>java RunCollatz -s 15
xxxxxxxxxxxxxxx                 
xxxxxxxxxxxxxx x               
xxxxxxxxxxxxx   x             
xxxxxxxxxxxx x xx            
xxxxxxxxxxx    x x          
xxxxxxxxxx x  xxxx         
xxxxxxxxx   xx xx x       
xxxxxxxx x x   x   x     
xxxxxxx     x xx  xx    
xxxxxx x   xxx  xx  x  
xxxxx   x x x xx  xxx 
xxxx x xxxxxxx  xx x x
xxx    xxxxxx xx      x
xx x  x xxxx  x x    xx
x   xxxx xx xxxxx   x  x
x x xx  x  xxxx x xx xx
x xxx x xx    x  x x
xx    x x  xx xxxx
x x  xxxx x   xxx x
xx xx   x x x   x
x  x x xxxxxx  xx
xxxxxx xxxx xx  x               
xxxxx  xxx  x xxx              
xxxx xx x xxxx x x            
xxx  x    xxx     x          
xx xxx   x x x   xx         
x  xx x xxxxxx  x  x       
xx     xxxxx xxx xx      
x x   x xxx  xx  x x    
x xxx x xx  xxxxx   
xx    x xx xxx x 
x x  xxx   xx   x
xx x x x  x xx
x       xxxx  x
x     x xx xxx
x   xxx   xx x
x x x x x    x
x  xx                      
xx  x                    
x xxx                   
xx x                 
x   x               
x xx              
x x

  15     85     44     41
Not great, but one can get some idea how the process works. Note that this plot only shows the odd numbers that appear in the iteration process. The last few steps shown correspond to 11, 17, 13, 5. You could fill in all the missing numbers, but then the plots become rather long and unwieldy.

You can find the results for k = 1,...,500 plus the plot above in the files collatz-500.txt and collatz-pic-15.txt in the assignment 4 directory on Blackboard.


Procedures

  1. Copy the file RunCollatz.java to a directory of your own. Make sure the directory is not readable by others. If you are not sure how to do this, go to polaris.andrew.cmu.edu and search for "directory protection".

  2. Write the Collatz class, and make sure that everything works properly. Make sure to place your code into a file called Collatz.java.

    You can use any editor you like for this, but I would strongly recommend to famliarize with emacs or xemacs now. It's by far the most powerful editor around, and will safe you tons of time in later courses. Integrated environments are nothing but a well-organized waste of time.

  3. To compile your code we will use the following command:
          > javac *.java
          
    To run your program on k = 20,...,30 type:
          > java RunCollatz 20 30
          

  4. Note that I am still waiting for AFS space. I will post instructions on exactly how to submit when the directories exist..

    When everything works properly, copy your files to your dropoff directory:

    /afs/andrew/course/15/121/Dropoff/hw04/andrew_login

    Files to hand in:

  5. Do not submit binaries, cores, or any test programs you may have written. The penalty will be horrible.


Grading

100 points total, distributed as follows: Make sure that your program compiles and runs properly. Outstanding solutions can get up to 20 extra credit points. As usual, take the 40 style points seriously, poorly commented spaghetti code is not acceptable.


Objectives

There are three issues here: class design, dynamic memory management for arrays, and a wee bit of algorithmic optimization. The Collatz class is not terribly complicated, but there is already some opportunity for design. Memory management in Java is easy, since, unlike with C++, you don't have to worry about desctructors and delete[]. Try to come up with some elegant ideas, but keep things simple.

Optimization is the fun part, make your code as fast as ever possible. Make sure that the computation of the Collatz function uses as few steps as possible. The down-steps should be O(1), but for the up-steps O(n) is the best you can hope for when dealing with numbers of size n bits. The trick here is to avoid unnecessary sweeps over the array. Don't copy or move data unless you absolutely have to. And so on.


Last modified: Wed Sep 12 11:46:02 EDT 2001