CSE562-S12: Please post project #2 questions as comments here

Monday, April 16, 2012

Please post project #2 questions as comments here

24 comments:

Deepak Raykar, April 16, 2012 at 4:46 PM
1. Do we need to consider data types other than string and integer, such as date, char, etc.?
2. How would we specify the primary key (column level or table level)?
3. What would the column length be for type integer?
4. Do we need to consider constraints other than primary key (e.g., non-null)?
SG, April 17, 2012 at 12:09 AM
Can we get a distribution of marks for the features? That will make it easier for us to evaluate our own project.
Srikanth Grama Ramaprasad, April 18, 2012 at 9:23 AM
The requirements page says that F(A1, ..., An) is a monotone function defined by a (+, *)-arithmetic expression with positive coefficients over the attributes. What will the format of this monotone function be? Can we get an example? Also, what will the format of the component scoring functions be?
Jay Patel, April 18, 2012 at 7:07 PM
1. Do we need to ensure that the primary key constraint is maintained after each insertion into a table, or can this be assumed?

2. Can we use "DROP INDEX I ON TABLE T" rather than just "DROP INDEX I"? Most (but not all) versions of DROP INDEX include a reference to the table in the DROP INDEX statement (http://www.w3schools.com/sql/sql_drop.asp).

3. Are we required to support both the long and short form of the SQL INSERT statement, or is one or the other sufficient?

Thanks,
Jay
Abhishek, April 19, 2012 at 3:06 AM
1. What will be the format of the where clause?
Ex: (Salary > 10000)

2. Do we have to consider where clause with multiple conditions (connected by AND/OR)?
Jan Chomicki, April 19, 2012 at 6:21 AM
1-2. WHERE clauses can be disjunctions of conjunctions of arithmetic comparisons (=,>,...)

3. No nested subqueries.
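To make "disjunctions of conjunctions" concrete, here is a minimal Java sketch of how such a WHERE clause could be represented and evaluated against one row. It is only an illustration: the class and method names are hypothetical, the operator set (=, <>, <, <=, >, >=) is an assumption since the thread lists only "=,>,...", and the example clause (Salary > 10000 AND Dept = 3) OR (Age < 30) is made up.

```java
import java.util.List;
import java.util.Map;

// Hypothetical names; not part of the project specification.
class WhereClauseSketch {
    /** One arithmetic comparison of the form: column OP constant. */
    static final class Comparison {
        final String column;
        final String op;
        final long constant;

        Comparison(String column, String op, long constant) {
            this.column = column;
            this.op = op;
            this.constant = constant;
        }

        boolean test(Map<String, Long> row) {
            long value = row.get(column);
            switch (op) { // assumed operator set
                case "=":  return value == constant;
                case "<>": return value != constant;
                case "<":  return value < constant;
                case "<=": return value <= constant;
                case ">":  return value > constant;
                case ">=": return value >= constant;
                default: throw new IllegalArgumentException("unknown operator: " + op);
            }
        }
    }

    /** Outer list = OR over groups; inner list = AND within one group. */
    static boolean matches(List<List<Comparison>> dnf, Map<String, Long> row) {
        for (List<Comparison> conjunction : dnf) {
            boolean allHold = true;
            for (Comparison c : conjunction) {
                if (!c.test(row)) {
                    allHold = false;
                    break;
                }
            }
            if (allHold) {
                return true; // one satisfied conjunction is enough
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // (Salary > 10000 AND Dept = 3) OR (Age < 30)
        List<List<Comparison>> where = List.of(
            List.of(new Comparison("Salary", ">", 10000), new Comparison("Dept", "=", 3)),
            List.of(new Comparison("Age", "<", 30)));
        Map<String, Long> row = Map.of("Salary", 12000L, "Dept", 3L, "Age", 45L);
        System.out.println(matches(where, row));
    }
}
```

A row satisfies the clause as soon as every comparison in at least one AND-group holds, so a nested list is enough and no general expression tree is needed.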
kevin, April 19, 2012 at 8:19 AM
Can we use an open-source parser generator (e.g., JavaCC) for the parser part?
Abhishek, April 21, 2012 at 7:00 PM
1. F(A, B, C) = A * B + 2 * C

In this case, to implement the threshold algorithm, do we make 3 lists (Si): one for A, a second for B, and a third for 2 * C?

OR

do we make 2 lists (Si): one for A * B and the other for 2 * C?

2. Can the scoring function be like (A + B) + 2 * C?
How do we decide how many lists to make for the threshold algorithm?
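For reference, the threshold algorithm as usually described (Fagin, Lotem, and Naor) keeps one sorted list per attribute and applies the whole function F to the last values seen in each list to get the threshold. The Java sketch below follows that reading. It is not an answer confirmed by the instructor in this thread. The class name, the in-memory `rows` array standing in for random access, and the sample data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;
import java.util.function.ToLongFunction;

// Hypothetical sketch: rows[i] holds the attribute values of row i.
class ThresholdSketch {
    static List<Integer> topK(long[][] rows, int k, ToLongFunction<long[]> f) {
        int n = rows.length;
        int m = rows[0].length;
        // One sorted list per attribute: row ids in descending order of that attribute.
        Integer[][] lists = new Integer[m][];
        for (int j = 0; j < m; j++) {
            final int col = j;
            lists[j] = new Integer[n];
            for (int i = 0; i < n; i++) {
                lists[j][i] = i;
            }
            Arrays.sort(lists[j], (x, y) -> Long.compare(rows[y][col], rows[x][col]));
        }
        Set<Integer> seen = new HashSet<>();
        // Min-heap on score, so the weakest of the current best k is on top.
        PriorityQueue<Integer> best = new PriorityQueue<>(
            Comparator.comparingLong((Integer id) -> f.applyAsLong(rows[id])));
        long[] last = new long[m]; // last value read from each list
        for (int depth = 0; depth < n; depth++) {
            for (int j = 0; j < m; j++) {
                int id = lists[j][depth];          // sorted access
                last[j] = rows[id][j];
                if (seen.add(id)) {                // first time this row is seen
                    best.add(id);                  // random access happens via rows[id]
                    if (best.size() > k) {
                        best.poll();
                    }
                }
            }
            long threshold = f.applyAsLong(last);  // F applied to the last-seen values
            if (best.size() == k && f.applyAsLong(rows[best.peek()]) >= threshold) {
                break;                             // no unseen row can score higher
            }
        }
        List<Integer> result = new ArrayList<>(best);
        result.sort((x, y) -> Long.compare(f.applyAsLong(rows[y]), f.applyAsLong(rows[x])));
        return result;
    }

    public static void main(String[] args) {
        long[][] rows = { {1, 1, 1}, {5, 2, 0}, {2, 2, 10}, {3, 3, 3} };
        // F(A, B, C) = A * B + 2 * C
        System.out.println(topK(rows, 2, r -> r[0] * r[1] + 2 * r[2]));
    }
}
```

Under this reading the number of lists equals the number of attributes in F, so A * B + 2 * C and (A + B) + 2 * C would both use three lists. The early stop is safe only because F is monotone.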
Jay Patel, April 24, 2012 at 3:38 AM
When traversing secondary indexes that are built on integers formatted as strings, the order of traversal will be different than if the integers were treated as integers. For example, as strings, all integers beginning with "1" will appear before all integers beginning with "2", so "10" will appear before "2" when the strings are sorted in "increasing" order. This clearly differs from integers that are sorted in increasing order, in which 2 will appear before 10.

How should we handle this difference in strings vs integers in light of the threshold algorithm?
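A small Java sketch of the ordering difference described above, using a hypothetical class name and sample values. It sorts the same numbers once as strings and once as integers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; the values are made up for illustration.
class KeyOrderDemo {
    /** Sorts the values the way a string-keyed index would order them. */
    static List<String> sortedAsStrings(List<Integer> values) {
        List<String> keys = new ArrayList<>();
        for (int v : values) {
            keys.add(Integer.toString(v));
        }
        Collections.sort(keys); // lexicographic comparison
        return keys;
    }

    /** Sorts the values numerically. */
    static List<Integer> sortedAsIntegers(List<Integer> values) {
        List<Integer> copy = new ArrayList<>(values);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(2, 10, 1);
        System.out.println(sortedAsStrings(values));  // [1, 10, 2]
        System.out.println(sortedAsIntegers(values)); // [1, 2, 10]
    }
}
```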
Jay Patel, April 25, 2012 at 4:18 PM
Assuming that all attribute values are positive integers, we can handle this issue by simply zero-padding all attribute values when converting them to strings prior to insertion into the data file and various indexes. Something like the following could be used:

String.format("%012d", integerValue)

Can we make the assumption that all attribute values will be positive so that we can minimize the impact of this integer/string ordering issue in the indexes?
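As a rough check of the padding idea, the Java sketch below (hypothetical class and method names) pads non-negative values with `String.format("%012d", ...)` and confirms that the padded strings compare in the same order as the integers. It rejects negative values because the sign character would break the lexicographic order.

```java
// Hypothetical helper; assumes all attribute values are non-negative ints.
class ZeroPadSketch {
    /** Returns a fixed-width, zero-padded key for a non-negative value. */
    static String pad(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values are not supported: " + value);
        }
        // Width 12 is more than enough for the 10 digits of Integer.MAX_VALUE.
        return String.format("%012d", value);
    }

    public static void main(String[] args) {
        System.out.println(pad(2));                         // 000000000002
        System.out.println(pad(10));                        // 000000000010
        System.out.println(pad(2).compareTo(pad(10)) < 0);  // true: 2 sorts before 10
    }
}
```

Because every key has the same width, string comparison and numeric comparison agree, so an index traversal in string order visits the values in integer order.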
Jan Chomicki, April 26, 2012 at 6:36 AM
Thank you for your solution.

You can assume that attribute values are non-negative (they can be 0).

--jc
Surya, May 5, 2012 at 12:00 AM
"1-2. WHERE clauses can be disjunctions of conjunctions of arithmetic comparisons (=,>,...)"

Could you please explain this more clearly, or give an example with the list of operators?
Phenoxis, May 5, 2012 at 2:15 PM
Whenever there are multiple records with the same value for a particular column, I find that my top-k results differ from MySQL's top-k results.
I narrowed this down to the fact that, because the order of columns is different (although my sorted lists and MySQL's sorted lists have the same values), the threshold is satisfied earlier in some cases, and hence my output differs from MySQL's output.

There is, however, a real possibility that I have implemented the threshold algorithm incorrectly.

Does anyone else suffer from a similar problem?
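When rows tie on score, more than one top-k answer can be valid, so one way to compare against MySQL is to compare the returned scores rather than the returned rows. The Java sketch below shows that idea. The class name and sample scores are hypothetical, and it assumes both result sets have already been reduced to their scores.

```java
import java.util.Arrays;

// Hypothetical helper for comparing two top-k outputs by score only.
class TopKCompareSketch {
    /** True if both results contain the same multiset of scores. */
    static boolean sameScores(long[] mine, long[] reference) {
        long[] a = mine.clone();
        long[] b = reference.clone();
        Arrays.sort(a);
        Arrays.sort(b);
        return Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        // Two answers that may contain different tied rows but the same scores.
        long[] mine = {24, 15, 15};
        long[] reference = {15, 24, 15};
        System.out.println(sameScores(mine, reference));
    }
}
```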