SQL query with AND? - sql

I have a small problem writing a query.
With the example data below, I want to retrieve the archive_id values that have both document_type_id = 18 and document_type_id = 20.
+------------+------------------+
| archive_id | document_type_id |
+------------+------------------+
| 1 | 20 |
| 1 | 18 |
| 3 | 20 |
| 4 | 11 |
| 2 | 23 |
| 5 | 20 |
| 6 | 23 |
| 6 | 20 |
| 6 | 18 |
+------------+------------------+
Expected result :
+------------+
| archive_id |
+------------+
| 1 |
| 6 |
+------------+
Same question but with document_type_id = 18, 20 and 23
Expected result :
+------------+
| archive_id |
+------------+
| 6 |
+------------+
Thanks for your help.

A simple having count would do the trick.
First case
select archive_id
from your_table
where document_type_id in (18,20)
group by archive_id
having count(distinct document_type_id) =2;
https://dbfiddle.uk/MnjR_4a_
Second case
select archive_id
from your_table
where document_type_id in (18,20,23)
group by archive_id
having count(distinct document_type_id) =3;
https://dbfiddle.uk/v9m3nPiq
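To try the first case locally, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module. SQLite is only an assumed stand-in for whatever database the asker uses, and your_table is the placeholder name from the answer.

```python
import sqlite3

# In-memory SQLite database standing in for the asker's table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (archive_id INTEGER, document_type_id INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [(1, 20), (1, 18), (3, 20), (4, 11), (2, 23), (5, 20), (6, 23), (6, 20), (6, 18)],
)

# Keep only rows with a wanted type, then require that each archive
# has both distinct types among the rows that are left.
query = """
    SELECT archive_id
    FROM your_table
    WHERE document_type_id IN (18, 20)
    GROUP BY archive_id
    HAVING COUNT(DISTINCT document_type_id) = 2
    ORDER BY archive_id
"""
both_types = [row[0] for row in conn.execute(query)]
print(both_types)  # [1, 6]
```

The DISTINCT matters if the same (archive_id, document_type_id) pair can appear more than once; without it a repeated pair could be counted twice.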

For a general solution, you could, for example, use an array containing the document_type_ids you are searching for:
WITH searched(ids) AS (VALUES (ARRAY[18,20,23]))
SELECT tab.archive_id
FROM tab CROSS JOIN searched
WHERE tab.document_type_id = ANY (searched.ids)
GROUP BY tab.archive_id
HAVING count(DISTINCT document_type_id) = cardinality(searched.ids);
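If the database has no array type, the same idea can be driven from application code. The sketch below is one possible way, assuming SQLite via Python's sqlite3; the helper name archives_with_all_types is hypothetical, and it builds one placeholder per searched id and compares the distinct count against the number of ids.

```python
import sqlite3

def archives_with_all_types(conn, type_ids):
    """Return archive_ids that have every document_type_id in type_ids."""
    wanted = sorted(set(type_ids))
    if not wanted:
        # Choice made for this sketch: an empty search list matches nothing.
        return []
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT archive_id
        FROM tab
        WHERE document_type_id IN ({placeholders})
        GROUP BY archive_id
        HAVING COUNT(DISTINCT document_type_id) = ?
        ORDER BY archive_id
    """
    # The last parameter is the number of distinct ids searched for.
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]

# Sample rows from the question, in an in-memory database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (archive_id INTEGER, document_type_id INTEGER)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?)",
    [(1, 20), (1, 18), (3, 20), (4, 11), (2, 23), (5, 20), (6, 23), (6, 20), (6, 18)],
)

print(archives_with_all_types(conn, [18, 20]))      # [1, 6]
print(archives_with_all_types(conn, [18, 20, 23]))  # [6]
```

Only the placeholders are interpolated into the SQL text; the ids themselves are passed as bound parameters.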

Related

Add Index to postgreSQL query result

My query result looks like this:
| A | B |
|-------|
| 1 | 2 |
| 1 | 4 |
| 1 | 6 |
| 1 | 9 |
| 1 | 1 |
| 1 | 6 |
| 1 | 9 |
Now I want to increase column A by the index of the result table, so the result would become like this:
| A | B |
|-------|
| 2 | 2 |
| 3 | 4 |
| 4 | 6 |
| 5 | 9 |
| 6 | 1 |
| 7 | 6 |
| 8 | 9 |
How can I do it?
Thanks!
You want row_number()
select (row_number() over (order by a) + 1) as A, b
from table t;
Maybe something like that:
SELECT
(row_number() OVER (ORDER BY A) + A) AS columnAIndex,
columnB
FROM ...
I don't have a PostgreSQL client installed here, so I haven't tested this query.
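For anyone who wants to check the row_number() idea without a PostgreSQL client, here is a rough sketch using SQLite through Python's sqlite3 (SQLite 3.25 or newer is needed for window functions). The pos column is a hypothetical ordering column added for this example: the question's result has no column that fixes the row order, and ordering by A alone would be arbitrary because every A is 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# pos is a hypothetical column that records the intended row order.
conn.execute("CREATE TABLE t (pos INTEGER, a INTEGER, b INTEGER)")
b_values = [2, 4, 6, 9, 1, 6, 9]
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(pos, 1, b) for pos, b in enumerate(b_values, start=1)],
)

# Add the 1-based position of each row to column a.
query = """
    SELECT a + ROW_NUMBER() OVER (ORDER BY pos) AS a, b
    FROM t
    ORDER BY pos
"""
indexed = conn.execute(query).fetchall()
print(indexed)  # [(2, 2), (3, 4), (4, 6), (5, 9), (6, 1), (7, 6), (8, 9)]
```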

Is there an easier way to find the row with a max value?

I have a schema where these two tables exist (among others)
participation
+------+--------+------------------+
| movie| person | role |
+------+--------+------------------+
| 1 | 1 | "Regisseur" |
| 1 | 1 | "Schauspieler" |
| 1 | 2 | "Schauspielerin" |
| 2 | 3 | "Regisseur" |
| 3 | 4 | "Regisseur" |
| 3 | 5 | "Schauspieler" |
| 3 | 6 | "Schauspieler" |
| 4 | 7 | "Schauspielerin" |
| 4 | 8 | "Schauspieler" |
| 5 | 1 | "Schauspieler" |
| 5 | 8 | "Schauspieler" |
| 5 | 14 | "Schauspieler" |
+------+--------+------------------+
movie
+----+------------------------------+------+-----+
| id | title | year | fsk |
+----+------------------------------+------+-----+
| 1 | "Die Bruecke am Fluss" | 1995 | 12 |
| 2 | "101 Dalmatiner" | 1961 | 0 |
| 3 | "Vernetzt - Johnny Mnemonic" | 1995 | 16 |
| 4 | "Waehrend Du schliefst..." | 1995 | 6 |
| 5 | "Casper" | 1995 | 6 |
| 6 | "French Kiss" | 1995 | 6 |
| 7 | "Stadtgespraech" | 1995 | 12 |
| 8 | "Apollo 13" | 1995 | 6 |
| 9 | "Schlafes Bruder" | 1995 | 12 |
| 10 | "Assassins - Die Killer" | 1995 | 16 |
| 11 | "Braveheart" | 1995 | 16 |
| 12 | "Das Netz" | 1995 | 12 |
| 13 | "Free Willy 2" | 1995 | 6 |
+----+------------------------------+------+-----+
I want to get the movie with the highest number of people who participated. I figured out an SQL statement that actually does this, but it looks super complicated. It looks like this:
SELECT title
FROM movie.movie
JOIN (SELECT *
FROM (SELECT Max(count_person) AS max_count_person
FROM (SELECT movie,
Count(person) AS count_person
FROM movie.participation
GROUP BY movie) AS countPersons) AS
maxCountPersons
JOIN (SELECT movie,
Count(person) AS count_person
FROM movie.participation
GROUP BY movie) AS countPersons
ON maxCountPersons.max_count_person =
countPersons.count_person)
AS maxPersonsmovie
ON maxPersonsmovie.movie = movie.id
The main problem is that I can't find an easier way to select the row with the highest value. If I could simply make a selection on the inner table and pick the row with the highest count_person without losing the information about the movie itself, this would look so much simpler. Is there a way to simplify this, or is this really the easiest way to do it?
Here is a way without subqueries:
SELECT m.title
FROM movie.movie m JOIN
movie.participation p
ON m.id = p.movie
GROUP BY m.title
ORDER BY COUNT(*) DESC
FETCH FIRST 1 ROW ONLY;
You can use LIMIT 1 instead of FETCH, if you prefer.
Note: In the event of ties, this only returns one value. That seems consistent with your question.
You can use the rank() window function to do this. Unlike LIMIT 1, it returns every movie that is tied for first place.
SELECT title
FROM (SELECT m.title,rank() over(order by count(p.person) desc) as rnk
FROM movie.movie m
LEFT JOIN movie.participation p ON m.id=p.movie
GROUP BY m.title
) t
WHERE rnk=1
SELECT title
FROM movie.movie
WHERE id = (SELECT movie
FROM movie.participation
GROUP BY movie
ORDER BY count(*) DESC
LIMIT 1);
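The difference between the LIMIT 1 and rank() answers shows up on the question's own data, because several movies tie. Below is a small sketch under a few assumptions: SQLite via Python's sqlite3 stands in for the real database, the movie. schema prefix is dropped, only movies 1 to 5 are loaded, and people are counted with COUNT(DISTINCT person) so that a person with two roles in one movie counts once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE participation (movie INTEGER, person INTEGER, role TEXT);
""")
conn.executemany("INSERT INTO movie VALUES (?, ?)", [
    (1, "Die Bruecke am Fluss"), (2, "101 Dalmatiner"),
    (3, "Vernetzt - Johnny Mnemonic"), (4, "Waehrend Du schliefst..."),
    (5, "Casper"),
])
conn.executemany("INSERT INTO participation VALUES (?, ?, ?)", [
    (1, 1, "Regisseur"), (1, 1, "Schauspieler"), (1, 2, "Schauspielerin"),
    (2, 3, "Regisseur"), (3, 4, "Regisseur"), (3, 5, "Schauspieler"),
    (3, 6, "Schauspieler"), (4, 7, "Schauspielerin"), (4, 8, "Schauspieler"),
    (5, 1, "Schauspieler"), (5, 8, "Schauspieler"), (5, 14, "Schauspieler"),
])

# Number of distinct people per movie, shared by both queries.
counts_cte = """
    WITH counts AS (
        SELECT movie, COUNT(DISTINCT person) AS n
        FROM participation
        GROUP BY movie
    )
"""

# Variant 1: sort by the count and keep one row (ties are cut arbitrarily).
single = [row[0] for row in conn.execute(counts_cte + """
    SELECT m.title
    FROM counts c JOIN movie m ON m.id = c.movie
    ORDER BY c.n DESC
    LIMIT 1
""")]

# Variant 2: rank by the count and keep every movie with rank 1.
tied_titles = [row[0] for row in conn.execute(counts_cte + """
    SELECT title
    FROM (SELECT m.title, RANK() OVER (ORDER BY c.n DESC) AS rnk
          FROM counts c JOIN movie m ON m.id = c.movie)
    WHERE rnk = 1
    ORDER BY title
""")]
print(single)
print(tied_titles)  # ['Casper', 'Vernetzt - Johnny Mnemonic']
```

Which variant is right depends on whether one arbitrary winner is acceptable or all tied movies should be listed.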

Get the Id of the matched data from other table. No duplicates of ID from both tables

Here is my table A.
| Id | GroupId | StoreId | Amount |
| 1 | 20 | 7 | 15000 |
| 2 | 20 | 7 | 1230 |
| 3 | 20 | 7 | 14230 |
| 4 | 20 | 7 | 9540 |
| 5 | 20 | 7 | 24230 |
| 6 | 20 | 7 | 1230 |
| 7 | 20 | 7 | 1230 |
Here is my table B.
| Id | GroupId | StoreId | Credit |
| 12 | 20 | 7 | 1230 |
| 14 | 20 | 7 | 15000 |
| 15 | 20 | 7 | 14230 |
| 16 | 20 | 7 | 1230 |
| 17 | 20 | 7 | 7004 |
| 18 | 20 | 7 | 65523 |
I want to get this result without duplicate Ids from either table.
I need the Ids from tables A and B where Amount = Credit.
| A.ID | B.ID | Amount |
| 1 | 14 | 15000 |
| 2 | 12 | 1230 |
| 3 | 15 | 14230 |
| 4 | null | 9540 |
| 5 | null | 24230 |
| 6 | 16 | 1230 |
| 7 | null | 1230 |
My problem is that when table A has two or more rows with the same Amount, I get a duplicate Id from table B where it should be null. Please help me. Thank you.
I think you want a left join. But this is tricky because you have duplicate amounts, but you only want one to match. The solution is to use row_number():
select . . .
from (select a.*, row_number() over (partition by amount order by id) as seqnum
from a
) a left join
(select b.*, row_number() over (partition by credit order by id) as seqnum
from b
)b
on a.amount = b.credit and a.seqnum = b.seqnum;
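Here is a runnable sketch of that row_number() pairing on the question's data, assuming SQLite via Python's sqlite3 in place of the asker's database. The table names table_a and table_b are placeholders, and only the Id and Amount/Credit columns are loaded because the join above uses nothing else.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER, amount INTEGER);
    CREATE TABLE table_b (id INTEGER, credit INTEGER);
""")
conn.executemany("INSERT INTO table_a VALUES (?, ?)", [
    (1, 15000), (2, 1230), (3, 14230), (4, 9540),
    (5, 24230), (6, 1230), (7, 1230),
])
conn.executemany("INSERT INTO table_b VALUES (?, ?)", [
    (12, 1230), (14, 15000), (15, 14230), (16, 1230), (17, 7004), (18, 65523),
])

# Number the rows within each amount on both sides, then pair the
# n-th row of A with the n-th row of B that has the same amount.
query = """
    SELECT a.id, b.id, a.amount
    FROM (SELECT id, amount,
                 ROW_NUMBER() OVER (PARTITION BY amount ORDER BY id) AS seqnum
          FROM table_a) AS a
    LEFT JOIN (SELECT id, credit,
                      ROW_NUMBER() OVER (PARTITION BY credit ORDER BY id) AS seqnum
               FROM table_b) AS b
      ON a.amount = b.credit AND a.seqnum = b.seqnum
    ORDER BY a.id
"""
pairs = conn.execute(query).fetchall()
for pair in pairs:
    print(pair)
# (1, 14, 15000), (2, 12, 1230), (3, 15, 14230), (4, None, 9540),
# (5, None, 24230), (6, 16, 1230), (7, None, 1230)
```

If matches must also stay within the same GroupId and StoreId, those columns would presumably go into both PARTITION BY clauses and the join condition as well.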
Another approach, which I think is simpler and shorter :) Note that it can return the same B.ID for several rows of A that share an Amount.
select ID [A.ID],
(select top 1 ID from TABLE_B where Credit = A.Amount) [B.ID],
Amount
from TABLE_A [A]

Hive Find Start and End of Group or Changing point

Here is the table:
+------+------+
| Name | Time |
+------+------+
| A | 1 |
| A | 2 |
| A | 3 |
| A | 4 |
| B | 5 |
| B | 6 |
| A | 7 |
| B | 8 |
| B | 9 |
| B | 10 |
+------+------+
I want to write a query to get:
+-------+--------+-----+
| Name | Start | End |
+-------+--------+-----+
| A | 1 | 4 |
| B | 5 | 6 |
| A | 7 | 7 |
| B | 8 | 10 |
+-------+--------+-----+
Does anyone know how to do it?
This is not the most efficient way, but it works.
SELECT name, min(time) AS start,max(time) As end
FROM (
SELECT name,time, time- DENSE_RANK() OVER (partition by name ORDER BY
time) AS diff
FROM foo
) t
GROUP BY name,diff;
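The trick is that, within one name, time minus its rank stays constant for as long as the times are consecutive, so that difference identifies each run. A minimal sketch to check this on the sample table, assuming SQLite via Python's sqlite3 instead of Hive; the aliases start_time and end_time are used here because end is a keyword in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (name TEXT, time INTEGER)")
conn.executemany("INSERT INTO foo VALUES (?, ?)", [
    ("A", 1), ("A", 2), ("A", 3), ("A", 4), ("B", 5),
    ("B", 6), ("A", 7), ("B", 8), ("B", 9), ("B", 10),
])

# diff = time - rank within the name; it only changes when a run is broken.
query = """
    SELECT name, MIN(time) AS start_time, MAX(time) AS end_time
    FROM (SELECT name, time,
                 time - DENSE_RANK() OVER (PARTITION BY name ORDER BY time) AS diff
          FROM foo)
    GROUP BY name, diff
    ORDER BY start_time
"""
runs = conn.execute(query).fetchall()
print(runs)  # [('A', 1, 4), ('B', 5, 6), ('A', 7, 7), ('B', 8, 10)]
```

This relies on time increasing by exactly 1 from row to row, as it does in the sample; with irregular timestamps a difference of two row numbers (overall and per name) would be needed instead.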
I would suggest trying the following query and building a GenericUDF to identify the gaps, which is much easier :)
SELECT name, sort_array(collect_list(time)) FROM foo GROUP BY name;

Oracle rank function issue

I am experiencing an issue with Oracle analytic functions.
I want the rank in Oracle to be displayed sequentially, but in a cyclic fashion, and the ranking should happen within a group.
Say I have 10 groups.
In each of the 10 groups, rows must be ranked up to 9. Beyond 9, the rank must start again from 1 and continue for however many rows remain.
emp id date1 date 2 Rank
123 13/6/2012 13/8/2021 1
123 14/2/2012 12/8/2014 2
.
.
123 9/10/2013 12/12/2015 9
123 16/10/2013 15/10/2013 1
123 16/3/2014 15/9/2015 2
In the above example, for the group of rows with empid 123, I have split the rank into two subgroups: sequentially from 1 to 9 is one group, and for the rest of the rows the rank starts again from 1. How can I achieve this with Oracle rank functions?
As per the suggestion from Egor Skriptunoff above:
select
empid, date1, date2
, row_number() over(order by date1, date2) as "rank"
, mod(row_number() over(order by date1, date2)-1, 9)+1 as "cycle_9"
from yourtable
Example result:
| empid | date1 | date2 | rn | ranked |
|-------|----------------------|----------------------|----|--------|
| 72232 | 2016-10-26T00:00:00Z | 2017-03-07T00:00:00Z | 1 | 1 |
| 04365 | 2016-11-03T00:00:00Z | 2017-07-29T00:00:00Z | 2 | 2 |
| 79203 | 2016-12-15T00:00:00Z | 2017-05-16T00:00:00Z | 3 | 3 |
| 68638 | 2016-12-18T00:00:00Z | 2017-02-08T00:00:00Z | 4 | 4 |
| 75784 | 2016-12-24T00:00:00Z | 2017-11-18T00:00:00Z | 5 | 5 |
| 72836 | 2016-12-24T00:00:00Z | 2018-09-10T00:00:00Z | 6 | 6 |
| 03679 | 2017-01-24T00:00:00Z | 2017-10-14T00:00:00Z | 7 | 7 |
| 43527 | 2017-02-12T00:00:00Z | 2017-01-15T00:00:00Z | 8 | 8 |
| 03138 | 2017-02-26T00:00:00Z | 2017-01-30T00:00:00Z | 9 | 9 |
| 89758 | 2017-03-29T00:00:00Z | 2018-04-12T00:00:00Z | 10 | 1 |
| 86377 | 2017-04-14T00:00:00Z | 2018-10-07T00:00:00Z | 11 | 2 |
| 49169 | 2017-04-28T00:00:00Z | 2017-04-21T00:00:00Z | 12 | 3 |
| 45523 | 2017-05-03T00:00:00Z | 2017-05-07T00:00:00Z | 13 | 4 |
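Since the question asks for the cycle to restart within each group, one possible variation is to add PARTITION BY empid to the window. The sketch below assumes SQLite via Python's sqlite3 rather than Oracle, so it uses the % operator where Oracle would use mod(); the rows are made-up test data (11 rows for empid 123 and 3 rows for empid 456), not taken from the question.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (empid INTEGER, date1 TEXT, date2 TEXT)")

def make_rows(empid, count):
    """Build made-up rows with increasing ISO dates for one empid."""
    start = date(2012, 1, 1)
    return [
        (empid,
         (start + timedelta(days=30 * i)).isoformat(),
         (start + timedelta(days=30 * i + 365)).isoformat())
        for i in range(count)
    ]

conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?)",
    make_rows(123, 11) + make_rows(456, 3),
)

# Number the rows per empid, then wrap the number into the range 1..9.
query = """
    SELECT empid,
           (ROW_NUMBER() OVER (PARTITION BY empid ORDER BY date1, date2) - 1) % 9 + 1
               AS cycle_9
    FROM yourtable
    ORDER BY empid, date1, date2
"""
cycles = {}
for empid, cycle in conn.execute(query):
    cycles.setdefault(empid, []).append(cycle)
print(cycles[123])  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2]
print(cycles[456])  # [1, 2, 3]
```

Subtracting 1 before the modulo and adding 1 afterwards is what keeps the ninth row at 9 instead of wrapping it to 0.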