Friday, March 30, 2012

Please post project #1 questions as comments here

51 comments:

  1. This comment has been removed by the author.

    Replies
    1. I have the following questions regarding project 1:

      1. Should dumpFile() remove the file from memory?

      2. Should dumpFile() also dump the index(es) on the file?
      - If no, dumping the file and then restoring it loses the index(es) on that file, is this correct? (assuming we have not already used dupIndex() to dump the index)

      3. restoreFile() should restore only file/table contents and not the indexes, is this right? (even when called after exit() i.e. even if index files are present on the disk)

      4. Dumped file can be in any (human-readable or non-human-readable) format, right?

      5. Should rebuildFile() update the indexes on the file?

      6. Should rebuildIndex() only physically remove deleted search keys, or should it build a new index from the in-memory file?
      - In the second case, only a new index is generated from the in-memory file, so it only makes sense to call this method after calling rebuildFile().

    2. 1. After dumpFile() is called, the file should remain in memory until you call exit.
      2. dumpFile() only needs to dump the file to disk. If we want to dump an index, we will call the dump index function, so don't worry about doing that in dumpFile().
      3. If we want to restore an index, there is a restoreIndex() function in the DataFile class, so restoreFile() only restores the file contents.
      4. Yes, you can dump anything you find useful into the file on disk, in any format.
      5. rebuildFile() only updates the file records, not the indexes.
      6. rebuildIndex() should update the whole index structure in memory: delete the pointers to records that have been deleted, remove search keys where necessary, and handle any resulting underflow.

  2. I have a few questions about the print method in DataManager.

    The print formats are shown on the example of Movies table from homework 1. All indentation needs to be done with tabs.

    Format to print record:

    Year: 1954
    Director: Akira Kurosawa
    Budget: 2000000
    Title: Shichinin no samurai

    It says here that I have to use tabs for indentation, but it seems that just using newlines is enough. Am I missing something here?

    Replies
    1. Use "\n" to separate the columns, and use ": " (a colon followed by a space) between each key and its value.
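
      A minimal sketch of that record format, assuming a record can be held as an ordered map from column name to value. The names RecordFormatSketch and formatRecord are made up for illustration and are not part of the project API. Whether a trailing newline is expected should be checked against the project description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the project API.
class RecordFormatSketch {

    // Builds one "Key: value" line per column, with "\n" between columns.
    static String formatRecord(Map<String, String> record) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> column : record.entrySet()) {
            if (sb.length() > 0) {
                sb.append("\n");
            }
            sb.append(column.getKey()).append(": ").append(column.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the columns in insertion order.
        Map<String, String> movie = new LinkedHashMap<>();
        movie.put("Year", "1954");
        movie.put("Director", "Akira Kurosawa");
        movie.put("Budget", "2000000");
        movie.put("Title", "Shichinin no samurai");
        System.out.println(formatRecord(movie));
    }
}
```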

  3. When we load the DataFile, is it expected to load all the indexes associated with it?

    Replies
    1. You don't need to do that. We will call restoreIndex() if we want to load the indexes on the file.

  4. The description of DataFile.insertRecord() says:
    "Also update indexes over the file."

    Does this imply saving the indexes both on disk and in memory, or just in memory?

  5. In the Index class, the return type of the method viewIndex() is void, but the test file ITests.java (line 59) expects it to return a string. How do we handle this?

    Replies
    1. This is actually an error. Please return a string from this function.

  6. When I remove any record from particular index of a DataFile, it means basically I mark the record as Deleted in the DataFile. So during another search on different index, this record should not be returned. Is this correct?

    Replies
    1. Could you reply to this post?

    2. Right, a deleted record should not be returned.

    3. Thank you, but my question was: when I remove a record using the Index iterator's remove method, does it have to remove the record from the DataFile? What is the difference between the Index iterator's remove method and the DataFile iterator's remove method?

    4. I have the same question

  7. If a record has a null in some column and we want to create an index on that column, how should we handle this case?

    Replies
    1. You can assume that the records do not contain nulls.
      --jc

  8. There seem to be no test cases available for rebuildFile and rebuildIndex. Could you publish some?

    Replies
    1. There are no testcases available for the extra credit. You need to generate those yourself.

      The code will be checked manually.

      --jc

  9. This comment has been removed by the author.

    Replies
    1. int whenToStart = 0;
      int offset = 0;
      int count = 0;
      String prevKey = null;
      String key = null;
      for (String line : si1.split("\n")) {
          if (whenToStart++ < 2)
              continue;

      This code fragment seems to skip the first two lines.
      A workaround would be to ensure that the result I return from viewIndex() is prefixed with two empty lines. This seems to give me a correct result as well (checked with print statements).
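
      A small sketch of why the two-empty-line prefix works, assuming the loop does nothing else with the first two lines. The class name, method name, and sample keys are made up for illustration. What the real test expects in those two lines is not known from the fragment alone.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; only the skipping logic mirrors the quoted fragment.
class SkipHeaderSketch {

    // Ignores the first two lines of the index view and returns the rest.
    static List<String> linesAfterHeader(String indexView) {
        List<String> kept = new ArrayList<>();
        int whenToStart = 0;
        for (String line : indexView.split("\n")) {
            if (whenToStart++ < 2) {
                continue;
            }
            kept.add(line);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Leading empty strings produced by split("\n") are kept, so the two
        // prefixed empty lines absorb the skip and no real key is lost.
        String prefixed = "\n\n" + "Kurosawa\nOzu";
        System.out.println(linesAfterHeader(prefixed));

        // Without the prefix, the first two keys are silently dropped.
        System.out.println(linesAfterHeader("Kurosawa\nOzu\nWelles"));
    }
}
```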

    2. This comment has been removed by the author.

  10. Will additional test cases be applied to our code?

    Can we stop when we get a good score from the given test cases, or should we go into more detail and ensure that any other test is also handled?

    I could state one possible test that would help clarify my question:

    df1 = new DataFile("df1",desc);
    df2 = DataManager.restoreFile("df1");

    Going by the laws of DataManager, the file name is no longer 'unique'. Should our code throw IllegalArgumentException in this case? Is it allowed?

    Should our code be able to handle such abstruse scenarios as well?

    Replies
    1. Based on your example, it seems that you first dump your data file df1 to disk and then restore it as a new data file df2. In this case you should indeed throw an IllegalArgumentException. All the illegal scenarios are listed in the project description. Just follow those requirements and you'll be fine.
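
      A minimal sketch of the uniqueness check, assuming the manager keeps a map of the files currently in memory keyed by name. UniqueNameSketch and its methods are made up for illustration and are not the project's DataManager. Both creating and restoring a file would go through the same check.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the name bookkeeping; not the project's DataManager.
class UniqueNameSketch {
    private final Map<String, Object> filesInMemory = new HashMap<>();

    // Registers a file under a name and rejects a name that is already in memory.
    void register(String name, Object file) {
        if (filesInMemory.containsKey(name)) {
            throw new IllegalArgumentException("File already in memory: " + name);
        }
        filesInMemory.put(name, file);
    }

    boolean isLoaded(String name) {
        return filesInMemory.containsKey(name);
    }

    public static void main(String[] args) {
        UniqueNameSketch manager = new UniqueNameSketch();
        manager.register("df1", new Object());      // like new DataFile("df1", desc)
        try {
            manager.register("df1", new Object());  // like restoreFile("df1") while df1 is loaded
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```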

  11. In the leaf node, we are using lists to point to all record IDs of a particular key.
    So the node can have at most 8 unique keys, with an unlimited number of record IDs pointed to by each key.

    Is this acceptable? Will this affect how viewIndex evaluates our index? If yes, how do you want us to display the index?

  12. The internal format is up to you but the print format should follow
    the specifications.

    --jc

    Replies
    1. Bear in mind that index nodes have between 4 and 8 records. This is also the limit on the number of record IDs in each node.

      --jc
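
      A rough sketch of the 4-to-8 occupancy rule, assuming a node simply stores a list of record IDs. NodeOccupancySketch and its methods are made up for illustration. It only reports when a split or a borrow/merge would be needed and does not perform either, and it ignores the root node, which in a B+ tree is normally allowed to hold fewer entries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node used only to illustrate the occupancy bounds.
class NodeOccupancySketch {
    static final int MIN_ENTRIES = 4;
    static final int MAX_ENTRIES = 8;

    private final List<Integer> recordIds = new ArrayList<>();

    // Adds an entry; returns true if the node now exceeds MAX_ENTRIES and must split.
    boolean addAndCheckOverflow(int recordId) {
        recordIds.add(recordId);
        return recordIds.size() > MAX_ENTRIES;
    }

    // Removes an entry; returns true if the node fell below MIN_ENTRIES
    // and must borrow from a sibling or merge.
    boolean removeAndCheckUnderflow(int recordId) {
        recordIds.remove(Integer.valueOf(recordId));
        return recordIds.size() < MIN_ENTRIES;
    }

    int size() {
        return recordIds.size();
    }

    public static void main(String[] args) {
        NodeOccupancySketch node = new NodeOccupancySketch();
        boolean overflow = false;
        for (int id = 1; id <= 9; id++) {
            overflow = node.addAndCheckOverflow(id);
        }
        System.out.println("overflow after 9 inserts: " + overflow);
    }
}
```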

    2. Professor, can we relax this condition to 4 to 8 keys per node? In that case the tree only has unique keys, which makes searching and returning record IDs easier. I think it is more efficient too.

  13. Your code still has to pass all tests. Then I'll consider giving full credit for your solution.
    --jc

  14. We have implemented the rebuildFile and rebuildIndex methods as follows.

    For rebuildFile,

    Remove all the records that are marked as deleted from DataFile.HashMap (a map of records), and then call the dumpFile() function.

    For rebuildIndex,

    Remove the current index from memory, and then insert all the records from DataFile.HashMap that are not marked as deleted into a new index object.

    From your blog post, it seems we have to implement delete functionality in B+ tree.

    Please let us know if our approach is correct.

    Replies
    1. For rebuildFile, your understanding is correct. For rebuildIndex, however, you cannot delete the old index and create a new one, because that would be exactly the same as inserting records into an index. Instead, you should update the current index in memory: delete the record pointers, delete the key values, adjust the internal nodes, and deal with any underflow.
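
      A minimal sketch of the rebuildFile part only, assuming records live in a map keyed by record ID and carry a soft-delete flag. RebuildFileSketch, StoredRecord, and the field names are made up for illustration. The in-place index update with underflow handling is not shown here.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical classes for illustration; not the project's DataFile.
class RebuildFileSketch {

    // A record payload plus a flag set when the record is soft-deleted.
    static class StoredRecord {
        final String data;
        boolean deleted;

        StoredRecord(String data) {
            this.data = data;
        }
    }

    // Physically removes every record marked as deleted; returns how many were removed.
    static int rebuildFile(Map<Integer, StoredRecord> records) {
        int removed = 0;
        Iterator<Map.Entry<Integer, StoredRecord>> it = records.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue().deleted) {
                it.remove();  // safe removal while iterating
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Map<Integer, StoredRecord> records = new HashMap<>();
        records.put(1, new StoredRecord("Shichinin no samurai"));
        records.put(2, new StoredRecord("Rashomon"));
        records.get(2).deleted = true;  // soft delete
        System.out.println("removed: " + rebuildFile(records));
        System.out.println("remaining: " + records.size());
    }
}
```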

    2. I have a question regarding the rebuildFile() function. When this function is called, should we call rebuildIndex() on all the indexes that the current file has? When we rebuild the file, we are hard-deleting records from it, so to stay consistent, should we delete all the corresponding index entries from all the indexes?

    3. Also, after hard-deleting a record from the file, if we leave the corresponding index entries undeleted, I guess it becomes difficult to delete those entries at a later point. So the best approach is to delete an index entry whenever a record is hard-deleted from the file. Is this correct?

  15. This comment has been removed by the author.

    Replies
    1. Prof.,

      What are we expected to submit? Should it be just our source code package, or our source code integrated with the JUnit test suite against which we are running our test cases?
      Also, do we need to attach a design doc or README along with it?

    2. Please submit only the source code package, database, and use the submit command as you did for homework 1.

    3. Does each member of the group need to submit the same code individually?

    4. Also, we are required to implement the classes in a package "database", but running the test cases requires the package to be renamed "reference". So should we submit the source code as the "reference" package or as "database"?

    5. Please name it database. You only need to submit the code once, but include all members' UBIT names in the folder name.

  16. Is there any limit on running time of some function(s), say iterator function(s) for data file?

    Replies
    1. Yes. If any test runs longer than 2 minutes, it will be considered a failure.

    2. But the professor mentioned that the longest test (UniqueRemove()) should not take more than 20 seconds. Is that not the case?

    3. If you take a look at the instructions, you will see that even though it should take no more than 20 seconds, we still allow 2 minutes.

  17. I am facing the following problem:

    All the test cases execute successfully and the final score gets printed. Then the restoreFile() method in DMTest.java somehow gets called. This method executes the following statement

    df1 = DataManager.createFile("test2", descriptor1);

    and since df1 is already in memory, createFile throws exception.

    Is anyone else facing the same problem? Does anybody have any idea about this?

  18. Is your test suite getting executed twice by any chance? Are you just using SC.java to run all your test cases, or are you executing the entire folder directory?

  19. I was executing the entire folder directory. I just tried with SC.java and it worked fine. Thank you :)

  20. The tests use the movieVault file, so our DataFile definition has to be the same as the DataFile that was used when dumping movieVault. We don't know how movieVault stores deleted records. Can you let us know?

  21. Should the rebuildFile() method dump the file after rebuilding it?

    The project description does not mention dumping the file, but in a reply to one of the comments, Zhouhan mentioned that rebuildFile() should dump the file.

    Replies
    1. Hmm... I think it is okay if you just rebuild the data file in memory. If we want to save the change to disk, we can do so by calling the dumpFile() function after we call rebuildFile().


