Find specific values and truncate - SQL

How can I find group values?
For example, I have the following SKUs:
SkuId Description
VN0A46ZERWV113000M CLASSIC
VN0A46ZERWV112000M CLASSIC
VN0A46ZERWV111500M CLASSIC
VN0A3WCVAZ31XXL Modern
VN0A3WCVAZ310XL Modern
VN0A3WCVAZ3100S Modern
VN0A3WCVAZ3100M Modern
VN0A3TE3RCO113000M Not Classic
VN0A3TE3RCO112000M Not Classic
VN0A3TE3RCO111500M Not Classic
How to describe it... :) I need to find all SKUs with the same description, find the common part of the SKU, and add a new row after every group. In general, the common part is the first 12 characters.
Expected result:
SkuId Description
VN0A46ZERWV113000M CLASSIC
VN0A46ZERWV112000M CLASSIC
VN0A46ZERWV111500M CLASSIC
VN0A46ZERWV1 NEW
VN0A3WCVAZ31XXL Modern
VN0A3WCVAZ310XL Modern
VN0A3WCVAZ3100S Modern
VN0A3WCVAZ3100M Modern
VN0A3WCVAZ31 NEW
VN0A3TE3RCO113000M Not Classic
VN0A3TE3RCO112000M Not Classic
VN0A3TE3RCO111500M Not Classic
VN0A3TE3RCO1 NEW

If I understand correctly, you can use UNION ALL and the substring function to do it:
use substring to get the first 12 characters of the SkuId column in a subquery, use DISTINCT to remove the duplicate 12-character prefixes, then UNION ALL the two result sets.
CREATE TABLE T(
SkuId VARCHAR(100),
Description VARCHAR(100)
);
INSERT INTO T VALUES ('VN0A46ZERWV113000M' ,'CLASSIC');
INSERT INTO T VALUES ('VN0A46ZERWV112000M' ,'CLASSIC');
INSERT INTO T VALUES ('VN0A46ZERWV111500M' ,'CLASSIC');
INSERT INTO T VALUES ('VN0A3WCVAZ31XXL' ,'Modern');
INSERT INTO T VALUES ('VN0A3WCVAZ310XL' ,'Modern');
INSERT INTO T VALUES ('VN0A3WCVAZ3100S' ,'Modern');
INSERT INTO T VALUES ('VN0A3WCVAZ3100M' ,'Modern');
INSERT INTO T VALUES ('VN0A3TE3RCO113000M' ,'Not Classic');
INSERT INTO T VALUES ('VN0A3TE3RCO112000M' ,'Not Classic');
INSERT INTO T VALUES ('VN0A3TE3RCO111500M' ,'Not Classic');
Query 1:
SELECT *
FROM (
    SELECT SkuId, Description
    FROM T
    UNION ALL
    SELECT DISTINCT substring(SkuId, 1, 12), 'New'
    FROM T
) t1
ORDER BY SkuId DESC
Results (ORDER BY SkuId DESC happens to place each 12-character prefix right after its group, because in descending order a prefix sorts after any longer string that starts with it):
| SkuId | Description |
|--------------------|-------------|
| VN0A46ZERWV113000M | CLASSIC |
| VN0A46ZERWV112000M | CLASSIC |
| VN0A46ZERWV111500M | CLASSIC |
| VN0A46ZERWV1 | New |
| VN0A3WCVAZ31XXL | Modern |
| VN0A3WCVAZ310XL | Modern |
| VN0A3WCVAZ3100S | Modern |
| VN0A3WCVAZ3100M | Modern |
| VN0A3WCVAZ31 | New |
| VN0A3TE3RCO113000M | Not Classic |
| VN0A3TE3RCO112000M | Not Classic |
| VN0A3TE3RCO111500M | Not Classic |
| VN0A3TE3RCO1 | New |

I think the additional rows you want are:
select skuid, 'NEW'
from (select distinct left(skuid, 12) as skuid, description
      from skus
     ) t;
For your data, and probably for your problem, this will do:
select distinct left(skuid, 12) as skuid, 'New'
from skus;
If you specifically want to exclude prefixes whose rows have different descriptions (the HAVING clause keeps a group only when its minimum and maximum description are equal, i.e. all descriptions match):
select left(skuid, 12) as skuid, 'New'
from skus
group by left(skuid, 12)
having min(description) = max(description);
You can add these into the table using insert:
insert into skus (skuid, description)
select distinct left(skuid, 12) as skuid, 'New'
from skus;
If you just want a result set, then use union and the correct order by:
select skuid, description
from ((select skuid, description, 1 as priority
       from skus
      ) union all
      (select distinct left(skuid, 12) as skuid, 'New', 2
       from skus
      )
     ) sd
order by left(skuid, 12) desc, priority, skuid desc;

Related

HQL, insert two rows if a condition is met

I have the following table called table_persons in Hive:
+--------+------+------------+
| people | type | date |
+--------+------+------------+
| lisa | bot | 19-04-2022 |
| wayne | per | 19-04-2022 |
+--------+------+------------+
If type is "bot", I have to add two rows to the table d1_info; if type is "per", I only have to add one row, so the result is the following:
+---------+------+------------+
| db_type | info | date |
+---------+------+------------+
| x_bot | x | 19-04-2022 |
| x_bnt | x | 19-04-2022 |
| x_per | b | 19-04-2022 |
+---------+------+------------+
How can I add two rows if this condition is met?
With a CASE WHEN, maybe?
You may try using a union to duplicate the rows with type bot. The following example unions a first query, which selects all records, with a second query, which selects only the records with type bot.
Edit
In response to the edited question, I have added an additional flag column (storing 1 or 0) named original to differentiate the duplicated entries:
SELECT
    p1.*,
    1 as original
FROM
    table_persons p1
UNION ALL
SELECT
    p1.*,
    0 as original
FROM
    table_persons p1
WHERE p1.type = 'bot'
You may then insert this into your other table d1_info by using the above query as a subquery or CTE and applying the desired transformations with CASE expressions, e.g.
INSERT INTO d1_info
    (`db_type`, `info`, `date`)
WITH merged_data AS (
    SELECT
        p1.*,
        1 as original
    FROM
        table_persons p1
    UNION ALL
    SELECT
        p1.*,
        0 as original
    FROM
        table_persons p1
    WHERE p1.type = 'bot'
)
SELECT
    CONCAT('x_', CASE
        WHEN m1.type = 'per' THEN m1.type
        WHEN m1.original = 1 AND m1.type = 'bot' THEN m1.type
        ELSE 'bnt'
    END) as db_type,
    CASE
        WHEN m1.type = 'per' THEN 'b'
        ELSE 'x'
    END as info,
    m1.date
FROM
    merged_data m1
ORDER BY m1.people, m1.date;
See the working demo on DB Fiddle.
I think what you want is to create a new table that captures your logic. This would simplify your query and let you add new types without editing the logic of a CASE expression. It may also make your logic cleaner to review later.
CREATE TABLE table_persons (
    `people` VARCHAR(5),
    `type` VARCHAR(3),
    `date` VARCHAR(10)
);
INSERT INTO table_persons
VALUES
    ('lisa', 'bot', '19-04-2022'),
    ('wayne', 'per', '19-04-2022');

CREATE TABLE info (
    `type` VARCHAR(5),
    `db_type` VARCHAR(5),
    `info` VARCHAR(1)
);
INSERT INTO info
VALUES
    ('bot', 'x_bot', 'x'),
    ('bot', 'x_bnt', 'x'),
    ('per', 'x_per', 'b');
and then you can easily do a join:
select
    info.db_type,
    info.info,
    persons.date
from
    table_persons persons
    inner join info on info.type = persons.type
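If you then want to load d1_info itself from this lookup, a minimal sketch (assuming d1_info has exactly the db_type, info, and date columns shown in the question, and that your dialect accepts INSERT ... SELECT with a column list, as the demo above does):
INSERT INTO d1_info (`db_type`, `info`, `date`)
SELECT
    info.db_type,
    info.info,
    persons.`date`
FROM
    table_persons persons
    INNER JOIN info ON info.type = persons.type;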

Postgres: insert data from another table inside array type columns

I have two tables on Postgres 11, like so, with some ARRAY type columns.
CREATE TABLE test (
id INT UNIQUE,
category TEXT NOT NULL,
quantitie NUMERIC,
quantities INT[],
dates INT[]
);
INSERT INTO test (id, category, quantitie, quantities, dates) VALUES (1, 'cat1', 33, ARRAY[66], ARRAY[123678]);
INSERT INTO test (id, category, quantitie, quantities, dates) VALUES (2, 'cat2', 99, ARRAY[22], ARRAY[879889]);
CREATE TABLE test2 (
idweb INT UNIQUE,
quantities INT[],
dates INT[]
);
INSERT INTO test2 (idweb, quantities, dates) VALUES (1, ARRAY[34], ARRAY[8776]);
INSERT INTO test2 (idweb, quantities, dates) VALUES (3, ARRAY[67], ARRAY[5443]);
I'm trying to update data from table test2 into table test, but only on rows with the same id, appending to the arrays of table test while keeping the original values.
I'm using INSERT ... ON CONFLICT; how do I update only the two columns quantities and dates?
Running the SQL below, I also get an error whose origin I don't understand:
Schema Error: error: column "quantitie" is of type numeric but expression is of type integer[]
INSERT INTO test (SELECT * FROM test2 WHERE idweb IN (SELECT id FROM test))
ON CONFLICT (id)
DO UPDATE
SET
quantities = array_cat(EXCLUDED.quantities, test.quantities),
dates = array_cat(EXCLUDED.dates, test.dates);
https://www.db-fiddle.com/f/rs8BpjDUCciyZVwu5efNJE/0
Is there a better way to update table test from table test2, or where am I going wrong with the SQL?
Update, to show the result needed in table test:
Schema (PostgreSQL v11)
| id | quantitie | quantities | dates | category |
| --- | --------- | ---------- | ----------- | --------- |
| 2 | 99 | 22 | 879889 | cat2 |
| 1 | 33 | 34,66 | 8776,123678 | cat1 |
Basically, your query fails because the structures of the tables do not match - so you cannot insert into test select * from test2.
You could work around this by adding "fake" columns to the select list, like so:
insert into test
select idweb, 'foo', 0, quantities, dates from test2 where idweb in (select id from test)
on conflict (id)
do update set
quantities = array_cat(excluded.quantities, test.quantities),
dates = array_cat(excluded.dates, test.dates);
(The 'foo' and 0 placeholders are never actually stored: every selected row conflicts on id, so only the DO UPDATE branch runs.) But this looks much more convoluted than needed. Essentially, you want an update statement, so I would just recommend:
update test
set
dates = test2.dates || test.dates,
quantities = test2.quantities || test.quantities
from test2
where test.id = test2.idweb
Note that this uses the || concatenation operator instead of array_cat(); it is shorter to write.
Demo on DB Fiddle:
| id | category | quantitie | quantities | dates         |
|----|----------|-----------|------------|---------------|
| 2  | cat2     | 99        | {22}       | {879889}      |
| 1  | cat1     | 33        | {34,66}    | {8776,123678} |
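As a quick standalone check of that note (my own sketch, not part of the original answer), both forms append arrays identically:
-- array_cat() and || produce the same result here
SELECT array_cat(ARRAY[1, 2], ARRAY[3]) AS with_array_cat,  -- {1,2,3}
       ARRAY[1, 2] || ARRAY[3]          AS with_operator;   -- {1,2,3}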

Can I count the occurrences for a Postgres array field?

I have a Postgres table that uses the array data type; it allows some magic that makes it possible to avoid having more tables, but its non-standard nature makes it more difficult for a beginner to work with.
I would like to get some summary data out of it.
Sample content:
CREATE TABLE public.cts (
id serial NOT NULL,
day timestamp NULL,
ct varchar[] NULL,
CONSTRAINT ctrlcts_pkey PRIMARY KEY (id)
);
INSERT INTO public.cts
(id, day, ct)
VALUES(29, '2015-01-24 00:00:00.000', '{ct286,ct281}');
INSERT INTO public.cts
(id, day, ct)
VALUES(30, '2015-01-25 00:00:00.000', '{ct286,ct281}');
INSERT INTO public.cts
(id, day, ct)
VALUES(31, '2015-01-26 00:00:00.000', '{ct286,ct277,ct281}');
I would like to get the total occurrences per array member, with output like this, for example:
| name  | value |
|-------|-------|
| ct286 | 3     |
| ct281 | 3     |
| ct277 | 1     |
Use the Postgres array function unnest():
SELECT name, COUNT(*) cnt
FROM cts, unnest(ct) as u(name)
GROUP BY name
Demo on DB Fiddle:
| name | cnt |
| ----- | --- |
| ct277 | 1 |
| ct281 | 3 |
| ct286 | 3 |
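If you want the most frequent members first, a small variation (my addition, not in the original answer) is to sort by the aggregate:
SELECT name, COUNT(*) AS cnt
FROM cts, unnest(ct) AS u(name)
GROUP BY name
ORDER BY cnt DESC;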

Count and name content from a SQL Server table

I have a table which is structured like this:
+-----+-------------+-------------------------+
| id | name | timestamp |
+-----+-------------+-------------------------+
| 1 | someName | 2016-04-20 09:41:41.213 |
| 2 | someName | 2016-04-20 09:42:41.213 |
| 3 | anotherName | 2016-04-20 09:43:41.213 |
| ... | ... | ... |
+-----+-------------+-------------------------+
Now I am trying to create a query that selects all timestamps since time x and counts the number of times the same name occurs in the result.
As an example, if we apply this query to the table above, with 2016-04-20 09:40:41.213 as the date from which it should count, the result should look like this:
+-------------+-------+
| name | count |
+-------------+-------+
| someName | 2 |
| anotherName | 1 |
+-------------+-------+
What I have accomplished so far is the following query, which gives me the names, but not their count:
WITH screenshots AS
(
    SELECT * FROM SavedScreenshotsLog
    WHERE timestamp > '2016-04-20 09:40:41.213'
)
SELECT s.name
FROM SavedScreenshotsLog s
INNER JOIN screenshots sc ON sc.name = s.name AND sc.timestamp = s.timestamp
ORDER BY s.name
I have browsed through Stack Overflow but was not able to find a solution that fits my needs, and as I am not very experienced with SQL, I am out of ideas.
You mention one table in your question, and then show a query with two tables. That makes it hard to follow the question.
What you are asking for is a simple aggregation:
SELECT name, COUNT(*)
FROM SavedScreenshotsLog
WHERE timestamp > '2016-04-20 09:40:41.213'
GROUP BY name
ORDER BY COUNT(*) DESC;
EDIT:
If you want "0" values, you can use conditional aggregation:
SELECT name,
       SUM(CASE WHEN timestamp > '2016-04-20 09:40:41.213' THEN 1 ELSE 0 END) as cnt
FROM SavedScreenshotsLog
GROUP BY name
ORDER BY cnt DESC;
Note that this will run slower because there is no filter on the dates prior to aggregation.
CREATE TABLE #TEST (name varchar(100), dt datetime)
INSERT INTO #TEST VALUES ('someName', '2016-04-20 09:41:41.213')
INSERT INTO #TEST VALUES ('someName', '2016-04-20 09:42:41.213')
INSERT INTO #TEST VALUES ('anotherName', '2016-04-20 09:43:41.213')

DECLARE @YourDatetime datetime = '2016-04-20 09:41:41.213'

SELECT name, count(dt)
FROM #TEST
WHERE dt >= @YourDatetime
GROUP BY name
I've posted this answer because the above queries can generate errors when converting the string in the WHERE clause into a datetime; it depends on the format of the datetime.
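For instance, an explicit CONVERT with style 121 (yyyy-mm-dd hh:mi:ss.mmm) pins the expected input format, so parsing no longer depends on the session's language or date settings. A small illustrative sketch against the #TEST table above:
-- Style 121 fixes the string format, independent of SET LANGUAGE / SET DATEFORMAT
DECLARE @YourDatetime datetime = CONVERT(datetime, '2016-04-20 09:41:41.213', 121)

SELECT name, count(dt)
FROM #TEST
WHERE dt >= @YourDatetime
GROUP BY name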

Get rows where value is not a substring in another row

I'm writing recursive SQL against a table that contains circular references.
No problem! I read that you can build a unique path to prevent infinite loops. Now I need to filter the list down to only the last record in each chain. I must be doing something wrong, though. Edit: I'm adding more records to this sample to make it clearer why just selecting the longest record doesn't work.
This is an example table:
create table strings (id int, string varchar(200));
insert into strings values (1, '1');
insert into strings values (2, '1,2');
insert into strings values (3, '1,2,3');
insert into strings values (4, '1,2,3,4');
insert into strings values (5, '5');
And my query:
select * from strings str1 where not exists
(
select * from strings str2
where str2.id <> str1.id
and str1.string || '%' like str2.string
)
I'd expect to get only the last records:
| id | string |
|----|---------|
| 4 | 1,2,3,4 |
| 5 | 5 |
Instead, I get them all:
| id | string |
|----|---------|
| 1 | 1 |
| 2 | 1,2 |
| 3 | 1,2,3 |
| 4 | 1,2,3,4 |
| 5 | 5 |
Link to sql fiddle: http://sqlfiddle.com/#!15/7a974/1
My problem was all around the 'LIKE' comparison: the operands were reversed. In a LIKE b, the pattern belongs on the right, so str1.string || '%' was being compared as a plain value instead of being used as a pattern. Swapping the two sides fixes it:
select *
from strings str1
where not exists
(
    select *
    from strings str2
    where str2.id <> str1.id
    and str2.string like str1.string || '%'
)
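One caveat, as my own addition rather than part of the original answer: plain prefix matching can also match across member boundaries, e.g. '1,2' would wrongly be treated as an ancestor of a hypothetical '1,22'. Anchoring the pattern on the comma delimiter avoids that:
select *
from strings str1
where not exists
(
    select *
    from strings str2
    where str2.id <> str1.id
    and str2.string like str1.string || ',%'
)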