How do I store this in Redis?
+------+-------------+
| val1 | val2        |
+------+-------------+
|   51 | Urbis orbi  |
|   77 | Occaecati   |
|   51 | Ea eligendi |
|   77 | Consequasit |
|   51 | Hic unde    |
+------+-------------+
Then, how do I count it in Redis?
e.g.
select count(*) as count from table where val1 = '51';
Each val1 may have multiple val2 values related to it, so you can use Redis Lists, where each val1 value is a key and its val2 values are the elements of the corresponding list.
The equivalent of the insert query can be:
LPUSH val1 val2
The equivalent of the select count query can be:
LLEN val1
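For example, loading the sample rows above (here the raw val1 values serve directly as key names; prefixing them, e.g. a hypothetical val1:51, is a common convention but not required):

LPUSH 51 "Urbis orbi"
LPUSH 51 "Ea eligendi"
LPUSH 51 "Hic unde"
LPUSH 77 "Occaecati"
LPUSH 77 "Consequasit"

LLEN 51
(integer) 3

LLEN returns the list length, which is exactly the count the SQL query computes.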
When I JOIN two tables in a SELECT statement, I get some duplicates in column 1 in the result set, like so:
| ID1 | ID2 |
|  12 |  34 |
|  12 |   4 |
| 123 |   1 |
| 123 |   4 |
But the combination of ID1, ID2 would always be unique in my result set.
I need to have a separate column that uniquely identifies each row of the result set, in such a way that each combination of ID1, ID2 produces the same new identifier, no matter when I perform the select. I am not allowed to use temporary tables, and obviously tricks like ROW_NUMBER() over the result set won't work either, because they may produce different identifiers for the same combination of ID1, ID2 after one of the two tables gets updated.
I tried concatenating the two integers from the select statement into one integer, like so:
SELECT ID1, ID2,
       CAST(CONCAT(CAST(ID1 AS nvarchar(100)), CAST(ID2 AS nvarchar(100))) AS INT) AS Identifier
but it does not solve the problem either, because now different combinations may get the same identifier:
'12' + '34' = '1234'
'123' + '4' = '1234'
I need to have something like:
| ID1 | ID2 | Identifier |
|  12 |  34 |          1 |
|  12 |   4 |          2 |
| 123 |   1 |          3 |
| 123 |   4 |          4 |
And when one of the tables gets a new row that corresponds to ID = 12 from the other table, the previously defined identifiers will remain the same, like so:
| ID1 | ID2 | Identifier |
|  12 |  34 |          1 |
|  12 |   4 |          2 |
|  12 |   2 |        578 |
| 123 |   1 |          3 |
| 123 |   4 |          4 |
That is, each combination of integers ID1, ID2 would produce the same new identifier, no matter when I perform the SELECT statement. How would I create that Identifier column to satisfy my needs?
If you know in advance the maximum value of these integers, then you can use arithmetic.
Say that they range from 0 to 999; then:
id1 * 1000 + id2 as identifier
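Multiplying ID1 by 1000 shifts it clear of any possible ID2, so distinct combinations can no longer collide. A quick sketch against the sample values (the table alias your_join_result is hypothetical, standing in for the joined result set):

SELECT ID1, ID2,
       ID1 * 1000 + ID2 AS Identifier
FROM your_join_result;

-- ID1 = 12,  ID2 = 34 -> 12034
-- ID1 = 123, ID2 = 4  -> 123004

Unlike the string concatenation, these identifiers are distinct, and they stay stable across runs because they depend only on the values themselves.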
I have the following dataset (simplified) that consists of a 'WORK_TYPE' and a 'TASKTIME' associated with that work type.
+-----------+----------+--------+
| WORK_TYPE | TASKTIME | OUTPUT |
+-----------+----------+--------+
| TYPE1     |       10 |      1 |
| TYPE1     |       20 |      1 |
| TYPE1     |       30 |      2 |
| TYPE1     |       30 |      2 |
| TYPE2     |       10 |      1 |
| TYPE2     |       10 |      1 |
| TYPE2     |       20 |      2 |
| TYPE2     |       20 |      2 |
+-----------+----------+--------+
I wish to use the width_bucket function on this dataset. However, I want to partition the data by WORK_TYPE so each type is bucketed independently of the rest of the dataset.
SELECT
    TASKTIME
    ,WORK_TYPE
    ,WIDTH_BUCKET(TASKTIME, 0, 100, 30) AS TASKTIME_BUCKET
    ,WIDTH_BUCKET(TASKTIME, 0, 100, 30) OVER (PARTITION BY WORK_TYPE) AS TASKTIME_BUCKET_WT -- This errors
FROM TABLE1
The first width_bucket works, but it buckets the values across the whole dataset.
I tried to use OVER (PARTITION BY WORK_TYPE) after the width_bucket, but this causes the following error: ORA-00923: FROM keyword not found where expected.
Any ideas?
If you want equal-width buckets for each group, you can calculate separate min and max values for each group:
SELECT TASKTIME, WORK_TYPE,
       WIDTH_BUCKET(TASKTIME, 0, 100, 30) AS TASKTIME_BUCKET,
       WIDTH_BUCKET(TASKTIME, MIN_TASKTIME, MAX_TASKTIME, 30) AS TASKTIME_BUCKET_WT
FROM (SELECT t1.*,
             MIN(TASKTIME) OVER (PARTITION BY WORK_TYPE) AS MIN_TASKTIME,
             MAX(TASKTIME) OVER (PARTITION BY WORK_TYPE) AS MAX_TASKTIME
      FROM TABLE1 t1
     ) t1
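This also explains the original error: WIDTH_BUCKET is a plain scalar function in Oracle, not an analytic one, so it cannot take an OVER clause directly, which is why ORA-00923 is raised. Computing the per-group bounds with analytic MIN and MAX first, then passing them to WIDTH_BUCKET, gives the per-WORK_TYPE bucketing.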
I'm working with PostgreSQL 8.0. I have a table as follows:
Table1:
ID  | Val  | Num | Val2
ABC | High |  22 | Low
ABC | Low  |   2 | High
ABC | High |  16 | Low
DFG | High |  10 | High
DFG | High |  50 | High
DFG | Low  |   3 | High
EGF | Low  |   2 | High
2BD | Low  |  34 | High
2BD | High |   2 | High
For rows that share the same ID in the first column, how can I get one output row per ID, giving 'High' in the Val column precedence over 'Low' or 'Mod', and then, among that ID's rows with 'High' in Val, selecting the one with the larger value in the Num column? For the above sample the output should be as follows:
ID  | Val  | Num | Val2
ABC | High |  22 | Low
DFG | High |  50 | High
EGF | Low  |   2 | High
2BD | High |   2 | High
Can someone guide me on how to achieve this?
I'm trying it this way:
select a.ID, a.Val, a.Num, a.Val2
from
    (select * from table1 where Val = 'High') a JOIN
    (select * from table1 where Val = 'High') b ON
    a.ID = b.ID
where a.Num > b.Num
But this will eliminate the EGF row from the output as well!!
select id, max(num) val, val2
from table1
group by id, val2
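A PostgreSQL-specific alternative is DISTINCT ON, which is available in 8.0 (window functions are not). A sketch, assuming the table is named table1 as in the question: order each ID's rows by a Val precedence and then by Num descending, and keep the first row per ID.

SELECT DISTINCT ON (ID) ID, Val, Num, Val2
FROM table1
ORDER BY ID,
         CASE Val WHEN 'High' THEN 0 WHEN 'Mod' THEN 1 ELSE 2 END, -- High before Mod before Low
         Num DESC;                                                 -- then the largest Num wins

For the sample data this returns exactly the four desired rows, including the EGF row, since an ID with only 'Low' rows simply has its best 'Low' row kept.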
I have the following table:
CREATE TABLE yow(
    userid INT,
    itemid INT,
    feedback INT,
    value INT
)
(userid, itemid, feedback) can be considered a primary key; each such tuple has one associated value.
I want a query which returns a table with the following columns:
userid | itemid | col0 | col1 | col2
where col0 contains the value for rows in yow where feedback = 0, col1 contains the value where feedback = 1, and so on.
I have a somewhat working query:
SELECT
    yow.userid AS uid,
    yow.itemid AS iid,
    ISNULL(col0.value, 0) AS col0,
    ISNULL(col1.value, 0) AS col1,
    ISNULL(col2.value, 0) AS col2
FROM yow
LEFT JOIN yow AS col0 ON col0.userid = yow.userid AND col0.itemid = yow.itemid
LEFT JOIN yow AS col1 ON col1.userid = yow.userid AND col1.itemid = yow.itemid
LEFT JOIN yow AS col2 ON col2.userid = yow.userid AND col2.itemid = yow.itemid
WHERE col0.feedback = 0
  AND col1.feedback = 1
  AND col2.feedback = 2
GROUP BY uid, iid
The problem is that I can have a value for (userid,itemid) in col1 or col2 but not the others. With this query, those rows are filtered out instead of the missing cells defaulting to 0.
As an example, I am getting something like this:
+-------+-------+--------+--------+--------+
|  UID  |  IID  |  COL0  |  COL1  |  COL2  |
+-------+-------+--------+--------+--------+
|     1 |   101 |     23 |     22 |    241 |
|     1 |   101 |     51 |     13 |    159 |
|     2 |   102 |     22 |     55 |    152 |
|     3 |   103 |     14 |     41 |    231 |
+-------+-------+--------+--------+--------+
But instead I want something like this, where the missing values of col0 are defaulted to 0.
+-------+-------+--------+--------+--------+
|  UID  |  IID  |  COL0  |  COL1  |  COL2  |
+-------+-------+--------+--------+--------+
|     1 |   101 |     23 |     22 |    241 |
|     1 |   101 |     51 |     13 |    159 |
|     1 |   102 |      0 |     15 |    142 |
|     2 |   102 |     22 |     55 |    152 |
|     2 |   103 |      0 |     45 |     92 |
|     3 |   103 |     14 |     41 |    231 |
+-------+-------+--------+--------+--------+
Can anyone suggest a fix to my query or perhaps propose a better one? I'm running this on H2, so I reckon the query should be somewhat standard. Thanks:)
The WHERE condition filters out all the NULL (missing) entries. To avoid that, you need to move each feedback = x predicate into the corresponding join condition. Try the following instead:
SELECT
    yow.userid AS uid,
    yow.itemid AS iid,
    ISNULL(col0.value, 0) AS col0,
    ISNULL(col1.value, 0) AS col1,
    ISNULL(col2.value, 0) AS col2
FROM yow
LEFT JOIN yow AS col0 ON col0.feedback = 0
    AND col0.userid = yow.userid AND col0.itemid = yow.itemid
LEFT JOIN yow AS col1 ON col1.feedback = 1
    AND col1.userid = yow.userid AND col1.itemid = yow.itemid
LEFT JOIN yow AS col2 ON col2.feedback = 2
    AND col2.userid = yow.userid AND col2.itemid = yow.itemid
GROUP BY uid, iid
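An alternative sketch that avoids the three self-joins entirely is conditional aggregation, which is standard SQL and should also run on H2. Assuming (userid, itemid, feedback) really is unique, MAX picks the single matching value for each cell, the ELSE 0 supplies the default, and the result has one row per (userid, itemid):

SELECT userid AS uid,
       itemid AS iid,
       MAX(CASE WHEN feedback = 0 THEN value ELSE 0 END) AS col0, -- 0 when no feedback = 0 row exists
       MAX(CASE WHEN feedback = 1 THEN value ELSE 0 END) AS col1,
       MAX(CASE WHEN feedback = 2 THEN value ELSE 0 END) AS col2
FROM yow
GROUP BY userid, itemid;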
I have this table.
+----+------+------+------+
| ks | time | val1 | val2 |
+----+------+------+------+
| A  |    1 |    1 |    1 |
| B  |    1 |    3 |    5 |
| A  |    2 |    6 |    7 |
| B  |    2 |   10 |   12 |
| A  |    4 |    6 |    7 |
| B  |    4 |   20 |   26 |
+----+------+------+------+
What I want to get, for each row, is:
ks | time | val1 | val1 at the next time for the same ks
To be clear, result of above example should be,
+----+------+------+-----------+
| ks | time | val1 | next.val1 |
+----+------+------+-----------+
| A  |    1 |    1 |         6 |
| B  |    1 |    3 |        10 |
| A  |    2 |    6 |         6 |
| B  |    2 |   10 |        20 |
| A  |    4 |    6 |      null |
| B  |    4 |   20 |      null |
+----+------+------+-----------+
(I need the same "next" for val2 as well.)
I have tried hard to come up with a Hive query for this, but still no luck. I was able to write such a query in SQL, as mentioned here (Quassnoi's answer), but couldn't create the equivalent in Hive, because Hive doesn't support subqueries in SELECT.
Can someone please help me achieve this?
Thanks in advance.
EDIT:
The query I tried was:
SELECT ks, time, val1, next[0] as next.val1 from
(SELECT ks, time, val1
COALESCE(
(
SELECT Val1, time
FROM myTable mi
WHERE mi.val1 > m.val1 AND mi.ks = m.ks
ORDER BY time
LIMIT 1
), CAST(0 AS BIGINT)) AS next
FROM myTable m
ORDER BY time) t2;
Your query seems quite similar to the "year ago" reporting that is ubiquitous in financial reporting. I think a LEFT OUTER JOIN is what you are looking for.
We join myTable to itself, naming the two instances of the same table m and n. For every entry in the first instance m, we attempt to find a matching record in n with the same ks value but an incremented value of time. If no such record exists, all column values for n will be NULL.
SELECT
    m.ks,
    m.time,
    m.val1,
    n.val1 AS next_val1,
    m.val2,
    n.val2 AS next_val2
FROM
    myTable m
LEFT OUTER JOIN
    myTable n
ON (
    m.ks = n.ks
    AND
    m.time + 1 = n.time
);
This returns the following:
ks  time  val1  next_val1  val2  next_val2
A      1     1          6     1          7
A      2     6          6     7          7
A      3     6       NULL     7       NULL
B      1     3         10     5         12
B      2    10         20    12         26
B      3    20       NULL    26       NULL
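Note that this output has rows for time 3 rather than time 4: the join condition m.time + 1 = n.time assumes consecutive time values. If time can have gaps, as in the question's jump from time 2 to time 4, the row before a gap will get NULL in the next_ columns.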
Hope that helps.
I find that Hive's custom map/reduce functionality works great for solving queries like this. It gives you the opportunity to consume a set of input rows and "reduce" them to one (or more) results.
This answer discusses the solution.
The key is to use CLUSTER BY to send all rows with the same key value to the same reducer, and hence the same instance of the reduce script. The script collects rows for the current key, outputs the reduced result when the key changes, and then starts collecting for the new key.
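A rough sketch of that shape for this question's table (the reducer script next_val.py is hypothetical: it would read tab-separated rows, buffer the previous row for the current ks, and emit each row together with the following row's val1 and val2; DISTRIBUTE BY plus SORT BY is used instead of bare CLUSTER BY so rows are also ordered by time within each ks):

ADD FILE next_val.py;

SELECT TRANSFORM (ks, time, val1, val2)
    USING 'python next_val.py'
    AS (ks, time, val1, next_val1, val2, next_val2)
FROM (
    SELECT ks, time, val1, val2
    FROM myTable
    DISTRIBUTE BY ks   -- all rows of one ks go to the same reducer
    SORT BY ks, time   -- and arrive there in time order
) t;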