Hierarchical numbering in SQL result - sql

I have a query that does a few full joins on some generally hierarchical data. Categories have Groups, and Groups have Items, though Groups and Items may be unattached to Categories and Groups, respectively. The results I'd like would look like this:
I. NULL
A. NULL
1. Item Orphan <-- no category or group
B. Group Red <-- no category
1. Item Apple
2. Item Banana
II. Category Bacon
A. Group Blue
B. Group Taupe
1. Item Kiwi
2. Item Watermelon
III. Category Atari
A. Group Silver
IV. Category Maui
...
Note: I added the periods after the numbering to make the example more readable, but the output needn't have them.
Right now the results look like this:
Category Group Item
NULL NULL Orphan
NULL Red Apple
NULL Red Banana
Bacon Blue NULL
Bacon Taupe Kiwi
Bacon Taupe Watermelon
Atari Silver NULL
Maui NULL NULL
And what I need is:
Co Category Go Group Io Item
I NULL A NULL 1 Orphan
I NULL B Red 1 Apple
I NULL B Red 2 Banana
II Bacon A Blue 1 NULL
II Bacon B Taupe 1 Kiwi
II Bacon B Taupe 2 Watermelon
III Atari A Silver 1 NULL
IV Maui A NULL 1 NULL
Although, the A on the last row and the 1s in the Io column on rows with NULL Items could also be NULLs. The I and the A on the Orphan Item row are important, though.
edit: This is simplified from the actual code for readability. In my case Items may have 0 or more Sub-items, so there may be multiple rows for each Item where the values for Co, Go, and Io would simply be duplicated. Sub-items aren't numbered though, so there's no ordinal column for them. So an example of some full rows would look like this:
Co Category Go Group Io Item Supplier Supplier_phone
I NULL B Red 2 Banana NULL NULL
II Bacon B Taupe 1 Kiwi Steve 555-1234
II Bacon B Taupe 1 Kiwi Sally 555-4242
II Bacon B Taupe 2 Watermelon NULL NULL
My problem is twofold:
How do I number each level of the hierarchy separately and have the number increment whenever that level changes?
How do I translate those numbers into a different ordinal space (e.g. translate 1,2,3 into a,b,c or I,II,III) and do so on a per-level basis?
This is in MS SQL Server 2012. I only have access to SQL, so T-SQL code is out.
For reference, the FROM and WHERE clauses together in my query are about 20 lines of code that I'd like to keep DRY. I've used CTEs before, but I'm not sure if that would be useful in this case or not.
I'm familiar with using row_number over (order by..., but I don't see how I could use it in this case.

For the group numbers, I believe you want DENSE_RANK. It is used the same way as ROW_NUMBER, except that rows with equal ORDER BY values share the same number and no numbers are skipped.
Without seeing your real code, it will be something like:
SELECT DENSE_RANK() OVER (ORDER BY Category) AS Co,
Category,
DENSE_RANK() OVER (PARTITION BY Category ORDER BY [Group]) AS Go,
[Group],
--etc
That said, I have no idea how you will convert numerics to numerals/letters without functions.
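One way the pieces could fit together is sketched below. It is only an illustration under several assumptions: SQLite (via Python's sqlite3 module) stands in for SQL Server, the table results and the columns cat_no, grp_no, and item_no are hypothetical names, and each level is ordered by its own name, which is not the ordering shown in the question. Letters come from a character-code offset (SQL Server has a comparable CHAR(64 + n)), and Roman numerals come from a small lookup table written as a CTE, so no user-defined function is needed.

```python
import sqlite3

# SQLite stands in for SQL Server; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (category TEXT, grp TEXT, item TEXT)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [
        (None, None, "Orphan"),
        (None, "Red", "Apple"),
        (None, "Red", "Banana"),
        ("Bacon", "Blue", None),
        ("Bacon", "Taupe", "Kiwi"),
        ("Bacon", "Taupe", "Watermelon"),
        ("Atari", "Silver", None),
        ("Maui", None, None),
    ],
)

query = """
WITH numbered AS (
    SELECT
        -- DENSE_RANK restarts inside each PARTITION BY and gives equal values
        -- the same number, so duplicated rows (e.g. sub-items) stay in step.
        DENSE_RANK() OVER (ORDER BY category) AS cat_no,
        category,
        DENSE_RANK() OVER (PARTITION BY category ORDER BY grp) AS grp_no,
        grp,
        DENSE_RANK() OVER (PARTITION BY category, grp ORDER BY item) AS item_no,
        item
    FROM results
),
-- Small lookup table for Roman numerals; extend it as far as needed.
roman (n, numeral) AS (
    VALUES (1, 'I'), (2, 'II'), (3, 'III'), (4, 'IV'), (5, 'V')
)
SELECT r.numeral, category, char(64 + grp_no), grp, item_no, item
FROM numbered
JOIN roman AS r ON r.n = numbered.cat_no
ORDER BY cat_no, grp_no, item_no
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the sketch orders categories by name, Atari is numbered before Bacon; the real query would order each window by whatever column defines the intended sequence.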

Related

Multiple entries with the same reference in a table with SQL

In a single table, I have multiple lines with the same reference information (ID). For the same day, customers had drinks, and the Appreciation is either 1 (yes) or 0 (no).
Table
ID DAY Drink Appreciation
1 1 Coffee 1
1 1 Tea 0
1 1 Soda 1
2 1 Coffee 1
2 1 Tea 1
3 1 Coffee 0
3 1 Tea 0
3 1 Iced Tea 1
I first tried to see who appreciated a certain drink, which is obviously very simple
Select ID, max(appreciation)
from table
where (day=1 and drink='coffee' and appreciation=1)
or (day=1 and drink='tea' and appreciation=1)
group by ID
Since I am not even interested in the drink, I used max to remove duplicates and keep only the line with the highest appreciation.
But what I want to do now is to see who in fact appreciated every drink they had. Again, I am not interested in every line in the end, but only the ID and the appreciation. How can I modify my where clause so that it applies to every single ID? Adding the ID to the condition is also not an option. I tried switching the or to and, but then it doesn't return any rows. How could I do this?
This should do the trick:
SELECT ID
FROM table
WHERE DRINK IN ('coffee','tea') -- or whatever else filter you want.
group by ID
HAVING MIN(appreciation) > 0
What it does is:
It takes the minimum appreciation across all lines in each group and keeps the group only if that minimum is greater than 0. Because appreciation is either 0 or 1, this means every line in the group has appreciation 1. The group is the ID, as defined in the GROUP BY clause.
As you can see, I'm using the HAVING clause, because you can't use aggregate functions in the WHERE clause.
Of course you can join other tables into the query as you like. Just be careful not to add an unwanted filter through the join, which might reduce your dataset in this query.
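To see the HAVING approach run against the sample data from the question, here is a minimal sketch. It assumes SQLite through Python's sqlite3 module, a hypothetical table name drinks (since table itself is a reserved word), and a filter on the day only rather than on particular drinks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "drinks" is a hypothetical table name standing in for the question's table.
conn.execute(
    "CREATE TABLE drinks (id INTEGER, day INTEGER, drink TEXT, appreciation INTEGER)"
)
conn.executemany(
    "INSERT INTO drinks VALUES (?, ?, ?, ?)",
    [
        (1, 1, "Coffee", 1),
        (1, 1, "Tea", 0),
        (1, 1, "Soda", 1),
        (2, 1, "Coffee", 1),
        (2, 1, "Tea", 1),
        (3, 1, "Coffee", 0),
        (3, 1, "Tea", 0),
        (3, 1, "Iced Tea", 1),
    ],
)

# Keep an ID only when its lowest appreciation is above 0,
# i.e. when every drink that customer had was appreciated.
ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT id
        FROM drinks
        WHERE day = 1
        GROUP BY id
        HAVING MIN(appreciation) > 0
        ORDER BY id
        """
    )
]
print(ids)  # [2]
```

Only customer 2 is returned: customers 1 and 3 each have at least one line with appreciation 0.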

How to tell in a Query I don't want duplicates?

So I got this query and it's pulling from tables like this:
Plantation TABLE
PLANT ID, Color Description
1 Red
2 Green
3 Purple
Vegetable Table
VegetableID, PLANT ID, Feeldesc
199 1 Harsh
200 1 Sticky
201 2 Bitter
202 3 Bland
Now in my query I join them on PLANT ID (I use a left join):
PLANT ID, Color Description, Feeldesc
1 Red Harsh
1 Red Sticky
2 Green Bitter
3 Purple Bland
So the problem is that in the query result you can see Red shows up twice! I can't have this, and
I'm not sure how to write the joins so that Red doesn't come up twice.
It seems remotely possible that you're asking how to do group indication, that is, showing a value which identifies or describes a group only on the first line of that group. In that case, you want to use the lag() window function.
Assuming setup of the schema and data is like this:
create table plant (plantId int not null primary key, color text not null);
create table vegetable (vegetableId int not null, plantId int not null,
Feeldesc text not null, primary key (vegetableId, plantId));
insert into plant values (1,'Red'),(2,'Green'),(3,'Purple');
insert into vegetable values (199,1,'Harsh'),(200,1,'Sticky'),
(201,2,'Bitter'),(202,3,'Bland');
The results you show (modulo column headings) could be obtained with this simple query:
select p.plantId, p.color, v.Feeldesc
from plant p left join vegetable v using (plantId)
order by plantId, vegetableId;
If you're looking to suppress display of the repeated information after the first line, this query will do it:
select
case when plantId = lag(plantId) over w then null
else plantId end as plantId,
case when p.color = lag(p.color) over w then null
else p.color end as color,
v.Feeldesc
from plant p left join vegetable v using (plantId)
window w as (partition by plantId order by vegetableId)
order by p.plantId, v.vegetableId;
The results look like this:
plantid | color | feeldesc
---------+--------+----------
1 | Red | Harsh
| | Sticky
2 | Green | Bitter
3 | Purple | Bland
(4 rows)
I had to do something like the above just this week to produce a listing directly out of psql which was easy for the end user to read; otherwise it never would have occurred to me that you might be asking about this functionality. Hopefully this answers your question, although I might be completely off base.
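For readers who want to try the lag() technique without a PostgreSQL instance, the following is a rough equivalent run through SQLite from Python. The output column names shown_id and shown_color are hypothetical, chosen so the final ORDER BY can still refer to the underlying plantId, and ON is used in place of USING; the final ORDER BY is added because the window's ordering does not by itself guarantee the order of the result rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE plant (plantId INTEGER PRIMARY KEY, color TEXT NOT NULL);
    CREATE TABLE vegetable (
        vegetableId INTEGER NOT NULL,
        plantId INTEGER NOT NULL,
        feeldesc TEXT NOT NULL,
        PRIMARY KEY (vegetableId, plantId)
    );
    INSERT INTO plant VALUES (1, 'Red'), (2, 'Green'), (3, 'Purple');
    INSERT INTO vegetable VALUES
        (199, 1, 'Harsh'), (200, 1, 'Sticky'), (201, 2, 'Bitter'), (202, 3, 'Bland');
    """
)

# LAG looks at the previous row of the same plant; when there is one,
# the repeated plantId and color are blanked out with NULL.
rows = conn.execute(
    """
    SELECT
        CASE WHEN p.plantId = LAG(p.plantId) OVER w THEN NULL
             ELSE p.plantId END AS shown_id,
        CASE WHEN p.plantId = LAG(p.plantId) OVER w THEN NULL
             ELSE p.color END AS shown_color,
        v.feeldesc
    FROM plant AS p
    LEFT JOIN vegetable AS v ON v.plantId = p.plantId
    WINDOW w AS (PARTITION BY p.plantId ORDER BY v.vegetableId)
    ORDER BY p.plantId, v.vegetableId
    """
).fetchall()

for row in rows:
    print(row)
```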
Check the array_agg function in the documentation; it can be used something like this:
SELECT
p.plantId
,p.color
,array_to_string(array_agg(v.Feeldesc),', ') AS feeldescs
FROM
vegetable v
INNER JOIN plant p USING (plantId)
GROUP BY
p.plantId
,p.color
or use
SELECT DISTINCT
p.plantId
,p.color
FROM
vegetable v
INNER JOIN plant p USING (plantId)
disclaimer: hand written, syntax errors expected :)
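The aggregation idea can be tried out with the sketch below. It is not the PostgreSQL query above: SQLite (through Python's sqlite3) is used as a stand-in, so group_concat replaces array_agg plus array_to_string, and the alias feeldescs is a hypothetical name. The order of values inside each concatenated string is not guaranteed by this query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE plant (plantId INTEGER PRIMARY KEY, color TEXT NOT NULL);
    CREATE TABLE vegetable (vegetableId INTEGER, plantId INTEGER, feeldesc TEXT);
    INSERT INTO plant VALUES (1, 'Red'), (2, 'Green'), (3, 'Purple');
    INSERT INTO vegetable VALUES
        (199, 1, 'Harsh'), (200, 1, 'Sticky'), (201, 2, 'Bitter'), (202, 3, 'Bland');
    """
)

# One row per plant; all of its descriptions are folded into a single string.
rows = conn.execute(
    """
    SELECT p.plantId, p.color, group_concat(v.feeldesc, ', ') AS feeldescs
    FROM plant AS p
    LEFT JOIN vegetable AS v ON v.plantId = p.plantId
    GROUP BY p.plantId, p.color
    ORDER BY p.plantId
    """
).fetchall()

for row in rows:
    print(row)
```

Red now appears once, with both of its descriptions in the third column.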

SELECT datafields with multiple groups and sums

I can't seem to group by multiple data fields and sum a particular grouped column.
I want to group Person to customer and then group customer to price and then sum price. The person with the highest combined sum(price) should be listed in ascending order.
Example:
table customer
-----------
customer | common_id
green 2
blue 2
orange 1
table invoice
----------
person | price | common_id
bob 2330 1
greg 360 2
greg 170 2
SELECT DISTINCT
min(person) As person,min(customer) AS customer, sum(price) as price
FROM invoice a LEFT JOIN customer b ON a.common_id = b.common_id
GROUP BY customer,price
ORDER BY person
The results I desire are:
**BOB:**
Orange, $2230
**GREG:**
green, $360
blue,$170
The colors are the customers that Greg and Bob handle. Each color has a price.
There are two issues that I can see. One is a bit picky, and one is quite fundamental.
Presentation of data in SQL
SQL returns tabular data sets. It's not able to return sub-sets with headings, looking something like a pivot table.
This means that this is not possible...
**BOB:**
Orange, $2230
**GREG:**
green, $360
blue, $170
But that this is possible...
Bob, Orange, $2230
Greg, Green, $360
Greg, Blue, $170
Relating data
I can visually see how you relate the data together...
table customer table invoice
-------------- -------------
customer | common_id person | price |common_id
green 2 greg 360 2
blue 2 greg 170 2
orange 1 bob 2330 1
But SQL doesn't have any implied ordering. Things can only be related if an expression can state that they are related. For example, the following is equally possible...
table customer table invoice
-------------- -------------
customer | common_id person | price |common_id
green 2 greg 170 2 \ These two have
blue 2 greg 360 2 / been swapped
orange 1 bob 2330 1
This means that you need rules (and likely additional fields) that explicitly state which customer record matches which invoice record, especially when there are multiples in both with the same common_id.
An example of a rule could be: the lowest price always matches with the first customer alphabetically. But then, what happens if you have three records in customer for common_id = 2, but only two records in invoice for common_id = 2? Or does the number of records always match, and do you enforce that?
Most likely you need an extra piece (or pieces) of information to know which records relate to each other.
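To make the point about explicit rules concrete, here is a sketch of the example rule mentioned above (lowest price pairs with the first customer alphabetically), written with ROW_NUMBER on both sides. It is purely illustrative: the rule is hypothetical, SQLite via Python's sqlite3 is assumed, and rn is an invented column name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (customer TEXT, common_id INTEGER);
    CREATE TABLE invoice (person TEXT, price INTEGER, common_id INTEGER);
    INSERT INTO customer VALUES ('green', 2), ('blue', 2), ('orange', 1);
    INSERT INTO invoice VALUES ('bob', 2330, 1), ('greg', 360, 2), ('greg', 170, 2);
    """
)

# Hypothetical pairing rule: within each common_id, the n-th customer
# (alphabetically) is matched with the n-th invoice (by ascending price).
rows = conn.execute(
    """
    WITH c AS (
        SELECT customer, common_id,
               ROW_NUMBER() OVER (PARTITION BY common_id ORDER BY customer) AS rn
        FROM customer
    ),
    i AS (
        SELECT person, price, common_id,
               ROW_NUMBER() OVER (PARTITION BY common_id ORDER BY price) AS rn
        FROM invoice
    )
    SELECT i.person, c.customer, i.price
    FROM i
    JOIN c ON c.common_id = i.common_id AND c.rn = i.rn
    ORDER BY i.person, c.customer
    """
).fetchall()

for row in rows:
    print(row)
```

Under this rule blue is paired with 170 and green with 360; a different rule would pair them differently, which is exactly why the relationship has to be stated in the data rather than implied by row order.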
You should group by all of your selected fields except the sum; then the group_concat function (MySQL) may help you concatenate the resulting rows of each group.
I'm not sure how you could possibly do this. Greg has 2 colors AND 2 prices, so how do you determine which goes with which?
Greg Blue 170 or Greg Blue 360? Or should Green be attached to either price?
I think the colors need to have unique identifiers, separate from the persons' unique identifiers.
Just a thought.

sql loader associating variable number physical records with a single physical record

I have the following data:
Aapple mango wood
Bpine tea orange
Bnuts blots match
Ajust another record
Now I want every record beginning with 'A' to be associated with each record beginning with 'B' that follows it, until another 'A' record or a non-'B' record is encountered.
For example, from the above data,
I would like to retrieve the following data (2 records):
mango tea
mango blots
The number of B records following an A record is variable; that is, an A record might be followed by any number of B records (3 in the data below).
Aapple mango wood
Bpine tea orange
Bnuts blots match
Basdf asdf asdf
Ajust another record
So the resulting output would be
mango tea
mango blots
mango asdf
Is it possible to do the above using SQL*Loader? Any help/pointers would be most welcome.
Edit:
I was thinking about using the CONTINUEIF clause, but there doesn't seem to be a way to eliminate the records that were retrieved earlier.
For example, if I use,
CONTINUEIF NEXT PRESERVE(1)='B'
I would get "mango tea blots asdf" in one go and not
"mango|tea"
"mango|blots"
"mango|asdf"
I think I would load the records into two separate tables based on the record type identifier,
see: http://download.oracle.com/docs/cd/B19306_01/server.102/b14215/ldr_control_file.htm#i1005614
and use RECNUM to preserve the order,
see: http://download.oracle.com/docs/cd/B10501_01/server.920/a96652/ch06.htm
You can then transform the data in SQL:
SELECT
a.text,
b.text,
a.id,
a.nxtid
FROM
(
-- id is assumed to hold the RECNUM value loaded for each record
SELECT text, id, NVL(LEAD(id,1) OVER (ORDER BY id),999999) AS nxtid
FROM t1
) a
LEFT JOIN t2 b ON b.id > a.id AND b.id < a.nxtid
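The two-table idea can be simulated end to end as follows. This is not SQL*Loader itself: Python splits the sample lines by record type and assigns a record number, standing in for what the loader's record-type routing and RECNUM would do, and SQLite stands in for Oracle (so COALESCE replaces NVL). The tables t1 and t2 and the column id follow the query above; taking the second whitespace-separated field as the text is an assumption based on the expected output. An inner join is used so that A records with no following B records drop out.

```python
import sqlite3

lines = [
    "Aapple mango wood",
    "Bpine tea orange",
    "Bnuts blots match",
    "Basdf asdf asdf",
    "Ajust another record",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER, text TEXT)")  # 'A' records
conn.execute("CREATE TABLE t2 (id INTEGER, text TEXT)")  # 'B' records

# Route each line by its first character and keep its position in the file.
for recnum, line in enumerate(lines, start=1):
    table = {"A": "t1", "B": "t2"}[line[0]]
    second_field = line.split()[1]
    conn.execute(f"INSERT INTO {table} VALUES (?, ?)", (recnum, second_field))

# Each A record owns the B records located between it and the next A record.
pairs = conn.execute(
    """
    SELECT a.text, b.text
    FROM (
        SELECT text, id,
               COALESCE(LEAD(id) OVER (ORDER BY id), 999999) AS nxtid
        FROM t1
    ) AS a
    JOIN t2 AS b ON b.id > a.id AND b.id < a.nxtid
    ORDER BY b.id
    """
).fetchall()

for pair in pairs:
    print(pair)
```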

Manipulate the sort result considering the user preference - database

Suppose we have a simple database containing following data:
name
apple
pear
banana
grape
The user wants to sort those fruits by name, and we will get, with no surprise:
apple
banana
grape
pear
However, for some reason, the user would like to place pear as the 3rd fruit, which means he would like to have:
apple
banana
pear
grape
And, importantly, the user wants to preserve this order when he sorts the fruits by name thereafter.
How should we tackle this problem? Off the top of my head, we could add a field user_sort_id, which is updated whenever the user sorts and then manipulates the sort result, and we would use that field as the sort key.
user_sort_id after each step:
name     initial value   sort by name   place pear 3rd
apple    0               0              0
pear     1               3              2
banana   2               1              1
grape    3               2              3
This approach should work in theory. However, in practice, I cannot think of an elegant and fast SQL statement that could accomplish this. Any ideas or alternatives?
If you want each user to have independent sort orders, you need another table.
CREATE TABLE user_sort_order (
name VARCHAR(?) NOT NULL REFERENCES your-other-table (name),
user_id INTEGER NOT NULL REFERENCES users (user_id),
sort_order INTEGER NOT NULL -- Could be float or decimal
);
Then ordering is easy.
SELECT name
FROM user_sort_order
WHERE user_id = ?
ORDER BY sort_order
There's no magic bullet for updating.
Delete all the user's rows, and insert rows with the new order. (Brute force always works.)
Update every row with the new order. (Could be a lot of UPDATE statements.)
Track the changes in your app, and update only the changed rows and the rows that have to be "bumped" by the changes. (Parsimonious, but error-prone.)
Don't let users impose their own sort order. (Usually not as bad an idea as it sounds.)
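The brute-force option from the list above might look like the sketch below. It assumes SQLite via Python's sqlite3; the helper names save_order and load_order are hypothetical, the foreign keys from the table definition are left out for brevity, and a primary key on (user_id, name) is added as an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE user_sort_order (
        name TEXT NOT NULL,
        user_id INTEGER NOT NULL,
        sort_order INTEGER NOT NULL,
        PRIMARY KEY (user_id, name)
    )
    """
)


def save_order(conn, user_id, names):
    """Replace one user's ordering: delete their rows, then insert the new order."""
    with conn:  # single transaction: commits on success, rolls back on error
        conn.execute("DELETE FROM user_sort_order WHERE user_id = ?", (user_id,))
        conn.executemany(
            "INSERT INTO user_sort_order (name, user_id, sort_order) VALUES (?, ?, ?)",
            [(name, user_id, position) for position, name in enumerate(names)],
        )


def load_order(conn, user_id):
    """Return the names in the order this user last saved."""
    cursor = conn.execute(
        "SELECT name FROM user_sort_order WHERE user_id = ? ORDER BY sort_order",
        (user_id,),
    )
    return [row[0] for row in cursor]


save_order(conn, 1, ["apple", "banana", "grape", "pear"])  # sorted by name
save_order(conn, 1, ["apple", "banana", "pear", "grape"])  # user moves pear to 3rd
names = load_order(conn, 1)
print(names)  # ['apple', 'banana', 'pear', 'grape']
```

Running the delete and the inserts in one transaction means a failure partway through leaves the user's previous order intact.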