Generate multiple rows from row with bitmask - sql

Let's say we have a table with 3 columns: key, value, and bitmask (a varchar of unknown maximum length):
abc | 23 | 101
xyz | 56 | 000101
Is it possible to write a query whose output has one row for every combination of key, value, and each 1 in the bitmask, with the index of that 1 as an integer column (it doesn't matter whether the index starts from 0 or 1)? For the example above:
abc | 23 | 1
abc | 23 | 3
xyz | 56 | 4
xyz | 56 | 6
Thanks for any ideas!

I think you might be better off choosing a maximum length for your varchar; the query below assumes it is at most 1000 characters.
SELECT * FROM
table
INNER JOIN
generate_series(1,1000) s(n)
ON
s.n <= char_length(bitmask) and
substring(bitmask from s.n for 1) = '1'
We generate a list of numbers:
s.n
---
1
2
3
4
...
And join it to the table in a way that causes repeated table rows:
s.n bitmask
--- -------
1 000101
2 000101
3 000101
4 000101
5 000101
6 000101
1 101
2 101
3 101
Then use s.n to take a one-character substring of the bitmask and check whether it equals '1':
s.n bitmask substr
--- ------- ------
1 000101 --substring('000101' from 1 for 1) = '1'? no
2 000101 --substring('000101' from 2 for 1) = '1'? no
3 000101 --substring('000101' from 3 for 1) = '1'? no
4 000101 --substring('000101' from 4 for 1) = '1'? yes
5 000101 --substring('000101' from 5 for 1) = '1'? no
6 000101 --substring('000101' from 6 for 1) = '1'? yes
1 101 --substring('101' from 1 for 1) = '1'? yes
2 101 --substring('101' from 2 for 1) = '1'? no
3 101 --substring('101' from 3 for 1) = '1'? yes
So s.n gives the number in the last column of your desired output, and the join condition keeps only the rows where the substring is '1'.
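For anyone who wants to try the numbers-table join without a PostgreSQL instance, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has no generate_series by default, so a recursive CTE stands in for it, and the series is capped at the longest bitmask in the table instead of a fixed 1000. The table name masks is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the two sample rows from the question.
conn.execute("CREATE TABLE masks (key TEXT, value INTEGER, bitmask TEXT)")
conn.executemany(
    "INSERT INTO masks VALUES (?, ?, ?)",
    [("abc", 23, "101"), ("xyz", 56, "000101")],
)

query = """
WITH RECURSIVE s(n) AS (
    -- Stand-in for generate_series: 1 .. length of the longest bitmask
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM s
    WHERE n < (SELECT MAX(LENGTH(bitmask)) FROM masks)
)
SELECT m.key, m.value, s.n
FROM masks AS m
JOIN s
  ON s.n <= LENGTH(m.bitmask)
 AND SUBSTR(m.bitmask, s.n, 1) = '1'
ORDER BY m.key, s.n
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Capping the series at the longest bitmask avoids having to guess a maximum length up front, at the cost of one extra aggregate over the table.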

Related

Use LEFT OUTER JOIN to include NULL values in query

I want the final query to return manufacturer_id | manufacturer_name | ice_cream_id | ice_cream_name, so that the output also includes manufacturers that are in the database but have no ice creams (NULL ice_cream_name). I also want the results in ascending order by manufacturer.manufacturer_id, ice_cream.ice_cream_id, which I already managed to do.
Here is my query and a sample of the dataset I am working with:
SELECT manufacturer.manufacturer_id, manufacturer.manufacturer_name, ice_cream.ice_cream_id, ice_cream.ice_cream_name
FROM ice_cream LEFT OUTER JOIN manufacturer
ON ice_cream.manufacturer_id = manufacturer.manufacturer_id OR manufacturer.manufacturer_name IS NULL
ORDER BY manufacturer.manufacturer_id, ice_cream.ice_cream_id ASC;
manufacturer
manufacturer_id manufacturer_name country
--------------- ----------------- ----------
1 Jen & Berry Canada
2 4 Friends Finland
3 Gelatron Italy
ice_cream
ice_cream_id ice_cream_name manufacturer_id manufacturing_cost
------------ ---------------- --------------- ------------------
1 Plain Vanilla 1 1
2 Vegan Vanilla 2 0.89
3 Super Strawberry 2 1.44
4 Very plain 2 1.2
ingredient
ingredient_id ingredient_name kcal protein plant_based
------------- --------------- ---------- ---------- -----------
1 Cream 400 3 0
2 Coconut cream 230 2.3 1
3 Sugar 387 0 1
4 Vanilla extract 12 0 1
5 Strawberry 33 0.7 1
6 Dark chocolate 535 8 1
contains
ice_cream_id ingredient_id quantity
------------ ------------- ----------
1 1 70
1 3 27
1 4 3
2 2 74
2 3 21
2 4 5
3 1 60
3 3 10
3 5 30
4 2 95
4 4 5
What is the logic of FROM table1 LEFT OUTER JOIN table2? Are the tables in the right order? And I think I am doing something extra in the ON clause that should be done in WHERE?
You want to keep all manufacturers according to your description. Hence, that table should be the first table in the LEFT JOIN. I would also suggest using table aliases:
SELECT m.manufacturer_id, m.manufacturer_name, ic.ice_cream_id, ic.ice_cream_name
FROM manufacturer m LEFT JOIN
ice_cream ic
ON ic.manufacturer_id = m.manufacturer_id
ORDER BY m.manufacturer_id, ic.ice_cream_id ASC;
This doesn't require any fiddling with the ON clause, just proper use of the LEFT JOIN.
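To see the effect of putting manufacturer on the left side, here is a small sketch that loads the sample rows from the question into an in-memory SQLite database (an assumption; the question does not say which database is used) and runs the same LEFT JOIN. Only the columns needed for the join are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE manufacturer (manufacturer_id INTEGER, manufacturer_name TEXT);
CREATE TABLE ice_cream (ice_cream_id INTEGER, ice_cream_name TEXT,
                        manufacturer_id INTEGER);
INSERT INTO manufacturer VALUES
    (1, 'Jen & Berry'), (2, '4 Friends'), (3, 'Gelatron');
INSERT INTO ice_cream VALUES
    (1, 'Plain Vanilla', 1), (2, 'Vegan Vanilla', 2),
    (3, 'Super Strawberry', 2), (4, 'Very plain', 2);
""")

# manufacturer is the left (preserved) table, so every manufacturer
# appears at least once; unmatched ones get NULL ice cream columns.
rows = conn.execute("""
    SELECT m.manufacturer_id, m.manufacturer_name,
           ic.ice_cream_id, ic.ice_cream_name
    FROM manufacturer AS m
    LEFT JOIN ice_cream AS ic
           ON ic.manufacturer_id = m.manufacturer_id
    ORDER BY m.manufacturer_id, ic.ice_cream_id
""").fetchall()

for row in rows:
    print(row)
```

Gelatron has no ice creams in the sample data, so it comes back with None (NULL) in both ice cream columns.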

Why do I get FAILED: Invalid table alias or column reference ‘ ’: (possible column names are: line) when querying in Hive?

I have a table whose structure is:
line string
and the content is:
product_id product_date orderattribute1 orderattribute2 orderattribute3 orderattribute4 ciiquantity ordquantity price
1 2014-09 2 1 1 1 1 3 153
1 2014-01 2 1 1 1 1 1 153
1 2014-04 2 2 1 1 1 1 164
1 2014-02 2 1 1 1 3 4 162
1 2014-07 2 1 1 1 9 23 224
1 2014-08 2 1 1 1 1 7 216
1 2014-03 2 1 1 1 3 13 180
1 2014-08 2 2 1 1 4 6 171
1 2014-05 2 1 1 1 3 7 180
....
(19000 lines omitted)
The total price of each line above is ordquantity*price.
I want to get the total price for every month, like this:
month sum
201401 ****
201402 ****
According to just the lines shown above, the sum for month 201408 is 7*216+6*171, which is derived from the rows (1 2014-08 2 1 1 1 1 7 216) and (1 2014-08 2 2 1 1 4 6 171).
I use the code:
create table product as select sum(ordquantity*price) as sum from text3 group by product_date;
and I got the FAILED:
FAILED:Invalid table alias or column reference 'product_date': (possible column names are: line)
I am not familiar with Hive, and I don't know how to solve the problem.
Did you create the table with the correct schema? The error says the table has a single column named line, so it looks like you didn't. In that case, create it with one column per field:
CREATE TABLE product
(product_id INT,
product_date STRING,
orderattribute1 INT,
orderattribute2 INT,
orderattribute3 INT,
orderattribute4 INT,
ciiquantity INT,
ordquantity INT,
price INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';
And the query for your requirement:
SELECT product_date, SUM(ordquantity*price) FROM product
GROUP BY product_date;
Hope I answered your question. Yippee!!
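To make the intended aggregation concrete outside of Hive, here is a minimal Python sketch that does the same GROUP BY on the sample rows. It assumes each value of the single line column is whitespace-separated and follows the column order of the header row in the question; the function name monthly_totals is made up for the example.

```python
from collections import defaultdict

# The nine sample rows from the question, one string per "line" value.
lines = [
    "1 2014-09 2 1 1 1 1 3 153",
    "1 2014-01 2 1 1 1 1 1 153",
    "1 2014-04 2 2 1 1 1 1 164",
    "1 2014-02 2 1 1 1 3 4 162",
    "1 2014-07 2 1 1 1 9 23 224",
    "1 2014-08 2 1 1 1 1 7 216",
    "1 2014-03 2 1 1 1 3 13 180",
    "1 2014-08 2 2 1 1 4 6 171",
    "1 2014-05 2 1 1 1 3 7 180",
]

def monthly_totals(lines):
    """Sum ordquantity * price per month, keyed as YYYYMM."""
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()  # assumption: whitespace-delimited fields
        month = fields[1].replace("-", "")  # product_date, e.g. 2014-08 -> 201408
        ordquantity = int(fields[7])
        price = int(fields[8])
        totals[month] += ordquantity * price
    return dict(totals)

totals = monthly_totals(lines)
for month in sorted(totals):
    print(month, totals[month])
```

The point is the same as in the answer: the fields have to be split into separate columns before product_date, ordquantity and price can be referenced by name.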

How to Update column Num with an incremental number in Master-Detail with row_number()?

I want to update my column Acc.DocHeader.Num and Acc.DocItem.Num with an incremental number. I have:
UPDATE x
SET x.Num = x.newNum,x.iNum=x.newNum
FROM (
SELECT Num,iNum, ROW_NUMBER() OVER (ORDER BY DocCreateDate ,DailyNum) AS newNum
FROM (SELECT h.Num,h.DocCreateDate,h.DailyNum,i.Num iNum FROM Acc.DocHeader h INNER JOIN Acc.DocItem i ON i.DocHeaderRef = h.Id WHERE h.Year = 1395 AND h.BranchRef = 1) AS header
) x
Why do I get Derived table 'x' is not updatable because the modification affects multiple base tables?
DocHeader Table :
Id Num Year DocCreateDate
-------------------------------------------------------
1 NULL 1396 2016-03-20
2 NULL 1395 2016-04-02
3 NULL 1395 2016-04-05
4 NULL 1395 2016-04-10
DocItem Table:
Id Num DocHeaderRef
----------------------------------------------
1 NULL 1
2 NULL 1
3 NULL 1
4 NULL 4
5 NULL 4
6 NULL 3
7 NULL 3
8 NULL 3
Expected output:
DocHeader Table:
Id Num Year DocCreateDate
-------------------------------------------------------
1 1 1396 2016-03-20
2 1 1395 2016-04-02
3 2 1395 2016-04-05
4 3 1395 2016-04-10
DocItem Table:
Id Num DocHeaderRef
----------------------------------------------
1 1 1
2 1 1
3 1 1
4 3 4
5 3 4
6 4 3
7 4 3
8 4 3
You are attempting to update columns from two different tables in a single update statement:
Num comes from Acc.DocHeader
iNum comes from Acc.DocItem
In SQL Server, a single UPDATE statement can modify only one table, which is why the derived table is rejected. That answers the question of why you cannot do what you want.
You can, however, update multiple tables in a single transaction. You can also use the OUTPUT clause to capture the values from the rows being updated.
I find your query a bit hard to follow -- and your question doesn't explain what you are trying to do -- so it is hard to suggest alternatives.
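As a rough illustration of the "one table per UPDATE, several statements per transaction" idea, here is a sketch using Python's sqlite3 module instead of SQL Server. It makes several assumptions that are not in the question: headers for year 1395 are numbered by DocCreateDate only (the sample data has no DailyNum or BranchRef), each item simply inherits the Num of its header, and the Acc schema prefix is dropped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DocHeader (Id INTEGER PRIMARY KEY, Num INTEGER,
                        Year INTEGER, DocCreateDate TEXT);
CREATE TABLE DocItem (Id INTEGER PRIMARY KEY, Num INTEGER,
                      DocHeaderRef INTEGER);
INSERT INTO DocHeader VALUES
    (1, NULL, 1396, '2016-03-20'), (2, NULL, 1395, '2016-04-02'),
    (3, NULL, 1395, '2016-04-05'), (4, NULL, 1395, '2016-04-10');
INSERT INTO DocItem VALUES
    (1, NULL, 1), (2, NULL, 1), (3, NULL, 1), (4, NULL, 4),
    (5, NULL, 4), (6, NULL, 3), (7, NULL, 3), (8, NULL, 3);
""")

year = 1395
with conn:  # both updates commit together, or roll back together
    header_ids = [
        row[0]
        for row in conn.execute(
            "SELECT Id FROM DocHeader WHERE Year = ? ORDER BY DocCreateDate",
            (year,),
        )
    ]
    # Statement 1: touches DocHeader only.
    conn.executemany(
        "UPDATE DocHeader SET Num = ? WHERE Id = ?",
        [(num, header_id) for num, header_id in enumerate(header_ids, start=1)],
    )
    # Statement 2: touches DocItem only, copying each header's new Num.
    conn.execute(
        """
        UPDATE DocItem
        SET Num = (SELECT h.Num FROM DocHeader AS h
                   WHERE h.Id = DocItem.DocHeaderRef)
        WHERE DocHeaderRef IN (SELECT Id FROM DocHeader WHERE Year = ?)
        """,
        (year,),
    )

headers = conn.execute("SELECT Id, Num FROM DocHeader ORDER BY Id").fetchall()
items = conn.execute("SELECT Id, Num FROM DocItem ORDER BY Id").fetchall()
print(headers)
print(items)
```

Under these assumptions the item numbers will not match the expected output in the question exactly, since that output does not say how item numbers relate to header numbers.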

Query to multiply certain sets of rows on a single table

I've got a bit of a complicated query that I'm struggling with. You will notice that the schema isn't the easiest thing to work with but it's what I've been given and there isn't time to re-design (common story!).
I have rows like the ones below. Note: The 3 digit value numbers are just random numbers I made up.
id field_id value
1 5 999
1 6 888
1 7 777
1 8 foo <--- foo so we want the 3 values above
1 9 don't care
2 5 123
2 6 456
2 7 789
2 8 bar <--- bar so we DON'T want the 3 values above
2 9 don't care
3 5 623
3 6 971
3 7 481
3 8 foo <--- foo so we want the 3 values above
3 9 don't care
...
...
n 5 987
n 6 654
n 7 321
n 8 foo <--- foo so we want the 3 values above
n 9 don't care
I want this result:
id result
1 999*888*777
3 623*971*481
...
n 987*654*321
Is this clear? So we have a table with n*5 rows. For each of the sets of 5 rows: 3 of them have values we might want to multiply together, 1 of them tells us if we want to multiply and 1 of them we don't care about so we don't want the row in the query result.
Can we do this in Oracle? Preferably one query.. I guess you need to use a multiplication operator (somehow), and a grouping.
Any help would be great. Thank you.
Something like this:
select m.id, exp(sum(ln(m.value)))
from mytab m
where m.field_id in (5, 6, 7)
and m.id in (select m2.id
from mytab m2
where m2.field_id = 8
and m2.value = 'foo')
group by m.id;
For example:
SQL> select * from mytab;
ID FIELD_ID VAL
---------- ---------- ---
1 5 999
1 6 888
1 7 777
1 8 foo
1 9 x
2 5 123
2 6 456
2 7 789
2 8 bar
2 9 x
3 5 623
3 6 971
3 7 481
3 8 foo
3 9 x
15 rows selected.
SQL> select m.id, exp(sum(ln(m.value))) result
2 from mytab m
3 where m.field_id in (5, 6, 7)
4 and m.id in (select m2.id
5 from mytab m2
6 where m2.field_id = 8
7 and m2.value = 'foo')
8 group by m.id;
ID RESULT
---------- ----------
1 689286024
3 290972773
Same logic, but without the hard-coded field_id values. Posting this answer in case it helps others.
SELECT a.id,
exp(sum(ln(a.val)))
FROM mytab a,
(SELECT DISTINCT id,
field_id
FROM mytab
WHERE val = 'foo') b
WHERE a.id = b.id
AND a.field_id < b.field_id
GROUP BY a.id;
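Both answers rely on the identity exp(ln a + ln b + ln c) = a*b*c to get a product out of SUM. A short Python sketch of the same arithmetic shows two things worth keeping in mind: the result is a floating-point approximation, and the logarithm is only defined for positive values. The function name product_via_logs is made up for the example.

```python
import math

def product_via_logs(values):
    """Multiply positive numbers the way exp(sum(ln(x))) does."""
    return math.exp(sum(math.log(v) for v in values))

values = [999, 888, 777]           # id 1 from the sample data
approx = product_via_logs(values)  # floating-point result
exact = math.prod(values)          # exact integer product: 689286024

print(exact)
print(round(approx) == exact)

# A zero (or negative) value has no real logarithm, so the trick fails there.
try:
    product_via_logs([5, 0, 7])
    zero_ok = True
except ValueError:
    zero_ok = False
print(zero_ok)
```

If the values can be zero or negative, the query needs extra handling (for example, treating signs and zeros separately) before applying the logarithm.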

SQL - conditional statements in crosstab queries - say what

I am working with MS Access 2007. I have 2 tables: Types of Soda, and Likeability.
Types of Soda are: Coke, Pepsi, Dr. Pepper, and Mello Yellow
Likeability is a lookup with these options: Liked, Disliked, No preference
I know how to count the number of Cokes or Mello Yellows in the table using DCount("[Types]", "[Types of Soda]", "[Types] = 'Coke'")
I also know how to count the number of Liked, Disliked, No preference:
DCount("[Perception]", "[Likeability]", "[Perception] = 'Liked'")
But what if I need to count the number of "Likes" by Type?
i.e. the table should look like this:
Coke | Pepsi | Dr. Pepper | Mello Yellow
Likes 9 2 12 19
Dislikes 2 45 1 0
No Preference 0 12 14 15
I know in Access I can create a cross tab queries, but my tables are joined by an ID. So my [Likeability] table has an ID column, which is the same as the ID column in my [Types] table. That's the relationship, and that's what connects my tables.
My problem is that I don't know how to apply the condition for counting the likes, dislikes, etc, for ONLY the Types that I specify. It seems like I first have to check the [Likeability] table for "Likes", and cross reference the ID with the ID in the [Types] table.
I am very confused, and you may be too, now. But all I want to do is count the # of Likes and Dislikes for each type of soda.
Please help.
It's not really clear (to me anyway) what your tables look like, so let's assume the following.
Tables
Soda
------
Soda_ID (Long Integer (Increment))
Soda_Name (Text(50))
Perception
------
Perception_ID (Long Integer (Increment))
Perception_Name (Text(50))
Likeability
-----------
Likeability_ID (Long Integer (Increment))
Soda_ID (Long Integer)
Perception_ID (Long Integer)
User_ID (Long Integer)
Data
Soda_Id Soda_Name
------- ---------
1 Coke
2 Pepsi
3 Dr. Pepper
4 Mello Yellow
Perception_ID Perception_Name
------------- ---------
1 Likes
2 Dislikes
3 No Preference
Likeability_ID Soda_ID Perception_ID User_ID
-------------- ------- ------------- -------
1 1 1 1
2 2 1 1
3 3 1 1
4 4 1 1
5 1 2 2
6 2 2 2
7 3 2 2
8 4 2 2
9 1 3 3
10 2 3 3
11 3 3 3
12 4 3 3
13 1 1 5
14 2 2 6
15 2 2 7
16 3 3 8
17 3 3 9
18 3 3 10
Transform query
You could write a query like this:
TRANSFORM
Count(l.Likeability_ID) AS CountOfLikeability_ID
SELECT
p.Perception_Name
FROM
Soda s
INNER JOIN (Perception p
INNER JOIN Likeability l
ON p.Perception_ID = l.Perception_ID)
ON s.Soda_Id = l.Soda_ID
WHERE
p.Perception_Name<>"No Preference"
GROUP BY
p.Perception_Name
PIVOT
s.Soda_Name;
query output
Perception_Name Coke Dr_ Pepper Mello Yellow Pepsi
--------------- ---- ---------- ------------ -----
Dislikes 1 1 1 3
Likes 2 1 1 1
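TRANSFORM ... PIVOT is specific to Access. As a cross-check of the output above, here is a sketch that loads the assumed sample data into SQLite through Python and builds the same crosstab with conditional aggregation (one SUM(CASE ...) per soda). The column aliases are chosen for the example, and the soda names are hard-coded, which PIVOT avoids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Soda (Soda_ID INTEGER, Soda_Name TEXT);
CREATE TABLE Perception (Perception_ID INTEGER, Perception_Name TEXT);
CREATE TABLE Likeability (Likeability_ID INTEGER, Soda_ID INTEGER,
                          Perception_ID INTEGER, User_ID INTEGER);
INSERT INTO Soda VALUES
    (1, 'Coke'), (2, 'Pepsi'), (3, 'Dr. Pepper'), (4, 'Mello Yellow');
INSERT INTO Perception VALUES
    (1, 'Likes'), (2, 'Dislikes'), (3, 'No Preference');
""")
conn.executemany(
    "INSERT INTO Likeability VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, 1), (2, 2, 1, 1), (3, 3, 1, 1), (4, 4, 1, 1),
        (5, 1, 2, 2), (6, 2, 2, 2), (7, 3, 2, 2), (8, 4, 2, 2),
        (9, 1, 3, 3), (10, 2, 3, 3), (11, 3, 3, 3), (12, 4, 3, 3),
        (13, 1, 1, 5), (14, 2, 2, 6), (15, 2, 2, 7),
        (16, 3, 3, 8), (17, 3, 3, 9), (18, 3, 3, 10),
    ],
)

# One output column per soda; each row of Likeability adds 1 to the
# column of its soda within the group of its perception.
rows = conn.execute("""
    SELECT p.Perception_Name,
           SUM(CASE WHEN s.Soda_Name = 'Coke' THEN 1 ELSE 0 END) AS Coke,
           SUM(CASE WHEN s.Soda_Name = 'Pepsi' THEN 1 ELSE 0 END) AS Pepsi,
           SUM(CASE WHEN s.Soda_Name = 'Dr. Pepper' THEN 1 ELSE 0 END) AS Dr_Pepper,
           SUM(CASE WHEN s.Soda_Name = 'Mello Yellow' THEN 1 ELSE 0 END) AS Mello_Yellow
    FROM Likeability AS l
    JOIN Soda AS s ON s.Soda_ID = l.Soda_ID
    JOIN Perception AS p ON p.Perception_ID = l.Perception_ID
    WHERE p.Perception_Name <> 'No Preference'
    GROUP BY p.Perception_Name
    ORDER BY p.Perception_Name
""").fetchall()

for row in rows:
    print(row)
```

Dropping the WHERE clause would add a No Preference row as well, which is what the table in the question asks for.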