I am trying to write SQL that will create a new table using data from an existing table.
The new table will have, among other columns, two columns like so:
reserved boolean
reserved_for character varying(10)
In the original table, I have data that looks like this:
id | identification | department | description | lastchange | available | type
------------+----------------+------------------------------------+-------------------------------------------+---------------------+-----------+--------
9145090050 | | | Reserved for llb | 2011-05-20 11:46:21 | f |
9145090096 | | | Reserved for ppa | 2013-01-26 12:31:56 | f |
9145090046 | | | | 2011-05-06 10:34:21 | f |
If the original table's description contains the text "Reserved for ...", then I want the reserved field in the new table to be set to true and reserved_for to contain the 3 or 4 characters that follow the "Reserved for " text.
So using the above table as an example, I want my new table to look like this:
id | reserved | reserved_for | lastchange |
------------+----------+----------------+---------------------+
9145090050 | true | llb | 2011-05-20 11:46:21 |
9145090096 | true | ppa | 2013-01-26 12:31:56 |
9145090046 | false | | 2011-05-06 10:34:21 |
The query I have for extracting the 4 characters after the "Reserved for " text looks like this:
select
substring(description from 13 for 4)
from
definition
where
description like 'Reserved for%';
It works in that it extracts the characters I need. How do I write the conditional statement in my create table command?
I think this is just some string manipulation on the original table:
select id,
(description like 'Reserved for%') as Reserved,
(case when description like 'Reserved for%'
then substring(description from 14)
end) as Reserved_For,
lastchange
from original;
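To create the new table directly from that select, wrap it in create table ... as (a sketch assuming PostgreSQL, since the substring(... from ...) form is Postgres syntax; new_table is a placeholder and original stands for your existing table, named definition in the question's query):
create table new_table as
select id,
       (description like 'Reserved for%') as reserved,
       (case when description like 'Reserved for%'
             then substring(description from 14)
        end) as reserved_for,
       lastchange
from original;
Note that create table ... as infers column types from the select, so reserved_for will come out as text rather than character varying(10); add a cast such as substring(description from 14)::varchar(10) if the narrower type matters.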
Perhaps this will help:
insert into yourtable(id, reserved, reserved_for, lastchange)
select t1.id,
case when t1.description like 'Reserved for%' then true else false end as reserved,
case when t1.description like 'Reserved for%' then substr(t1.description, strpos(t1.description, 'for') + 4) else null end as reserved_for,
t1.lastchange
from definition t1;
I want to clean a dataset because there are repeated keys that should not be there. Although the key is repeated, other fields do change. On repetition, I want to keep those entries whose country field is not null. Let's see it with a simplified example:
| email   | country |
| ------- | ------- |
| 1#x.com | null    |
| 1#x.com | PT      |
| 2#x.com | SP      |
| 2#x.com | PT      |
| 3#x.com | null    |
| 3#x.com | null    |
| 4#x.com | UK      |
| 5#x.com | null    |
Email acts as the key, and country is the field I want to filter by. On email repetition:
Retrieve the entry whose country is not null (case 1)
If there are several entries whose country is not null, retrieve one of them, the first occurrence for simplicity (case 2)
If all the entries' country is null, again, retrieve only one of them (case 3)
If the entry key is not repeated, just retrieve it no matter what its country is (case 4 and 5)
The expected output should be:
| email   | country |
| ------- | ------- |
| 1#x.com | PT      |
| 2#x.com | SP      |
| 3#x.com | null    |
| 4#x.com | UK      |
| 5#x.com | null    |
I have thought of doing a UNION or some type of JOIN to achieve this. One possibility could be querying:
SELECT
...
FROM (
SELECT *
FROM `myproject.mydataset.mytable`
WHERE country IS NOT NULL
) AS a
...
and then match it against the full table to add the missing values, but I cannot see how to do it since my experience with SQL is limited.
Also, I have read about the COALESCE function and I think it could be helpful for the task.
Consider the approach below:
select *
from `myproject.mydataset.mytable`
where true
qualify row_number() over(partition by email order by country nulls last) = 1
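Here row_number() over(partition by email order by country nulls last) numbers each email's rows with the non-null countries first, and qualify keeps only the row numbered 1 per email, which covers cases 1 through 5 above. The where true is only there because BigQuery requires a where, group by, or having clause alongside qualify.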
I'm struggling to find a value that might be in different tables, but using UNION is a pain as there are a lot of tables.
[Different table that contains the suffixes from the TestTable_]
| ID | Name|
| -------- | -----------|
| 1 | TestTable1 |
| 2 | TestTable2 |
| 3 | TestTable3 |
| 4 | TestTable4 |
TestTable1 content:
| id | Name    | q1              | a1             |
| -- | ------- | --------------- | -------------- |
| 1  | goose   | withFeather?    | featherID      |
| 2  | rooster | withoutFeather? | shinyfeatherID |
| 3  | rooster | age             | 20             |
TestTable2 content:
| id | Name             | q1              | a1             |
| -- | ---------------- | --------------- | -------------- |
| 1  | brazilian_goose  | withFeather?    | featherID      |
| 2  | annoying_rooster | withoutFeather? | shinyfeatherID |
| 3  | annoying_rooster | no_legs?        | dead           |
TestTable3 content:
| id | Name    | q1              | a1             |
| -- | ------- | --------------- | -------------- |
| 1  | goose   | withFeather?    | featherID      |
| 2  | rooster | withoutFeather? | shinyfeatherID |
| 3  | rooster | age             | 15             |
Common columns: q1 and a1
Is there a way to parse through all of them to look up a specific value without using UNION, given that some of them might have different columns?
Something like: check if "q1='age'" exists in all those tables (from 1 to 50)
Select q1,*
from (something)
where q1 exists in (TestTable_*)... or something like that.
If not possible, not a problem.
You could use dynamic SQL, but something I do in situations like this, where I have a list of tables I want to quickly perform the same action on, is to paste the list of tables into a spreadsheet, type the query into a cell with a placeholder like #table, and use the substitute function to swap in each table name.
Alternatively, I paste the list into SSMS and use SHIFT+ALT+ArrowKey to select the column and start typing.
So here is my list of tables.
Then I use that key combo, and my cursor selects all those rows at once.
Now I can start typing, and every selected row gets the input.
Then I just go to the other side of the table names and repeat the action.
It's not a perfect solution, but it's a quick and dirty way of doing something repetitive.
If you want to find all the tables with that column name you can use information schema.
Select table_name from INFORMATION_SCHEMA.COLUMNS where COLUMN_NAME = 'q1'
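Going one step further, the same metadata query can generate the per-table probes for you (a sketch; it assumes each matching table also has an a1 column, and 'age' is the value being searched):
-- Produces one SELECT statement per table that has a q1 column;
-- copy the generated statements out and run them
SELECT 'SELECT ''' + TABLE_NAME + ''' AS source_table, q1, a1 FROM '
       + QUOTENAME(TABLE_NAME) + ' WHERE q1 = ''age'';'
FROM INFORMATION_SCHEMA.COLUMNS
WHERE COLUMN_NAME = 'q1';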
Given the type of solution you are after I can offer a method that I've had to use on legacy systems.
You can query sys.columns for the name of the column(s) you need to find in N tables and join using object_id to sys.tables where type='U'. This will give you a list of table names.
From this list you can then build a working query for each table and, depending on your requirements (is this ad-hoc?), either manually execute each one yourself or build a procedure that will do it for you using sp_executesql.
E.g.
select t.name, c.name
into #workingtable
from sys.columns c
join sys.tables t on t.object_id=c.object_id
where c.name in .....
pseudocode:
begin loop while rows exist in #workingtable
select top 1 row from #workingtable
set @sql = your query specific to that table and column(s)
exec(@sql) / sp_executesql / try/catch as necessary
delete row from #workingtable
end loop
Hopefully that gives you ideas, at least, for how you might implement your requirements.
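To make that concrete, here is a minimal runnable sketch of the loop (assuming SQL Server; the searched value 'age' and the q1/a1 columns are illustrative, and each matching table is assumed to have both):
-- Collect the user tables that have a q1 column
select t.name as table_name
into #workingtable
from sys.columns c
join sys.tables t on t.object_id = c.object_id
where c.name = 'q1' and t.type = 'U';

declare @tbl sysname, @sql nvarchar(max);

while exists (select 1 from #workingtable)
begin
    select top 1 @tbl = table_name from #workingtable;

    -- Build and run the per-table probe; quotename() guards the table name
    set @sql = N'select ''' + @tbl + N''' as source_table, q1, a1 from '
             + quotename(@tbl) + N' where q1 = ''age'';';
    exec sp_executesql @sql;

    delete from #workingtable where table_name = @tbl;
end

drop table #workingtable;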
I am having a problem creating views in Snowflake over a VARIANT field that stores JSON data whose keys are dynamic; the key definitions are stored in another table. I want to create a view that has dynamic columns based on the foreign key.
Here is what my tables look like:
companies:
| id | name |
| -- | ---- |
| 1 | Company 1 |
| 2 | Company 2 |
invoices:
| id | invoice_number | custom_fields | company_id |
| -- | -------------- | ------------- | ---------- |
| 1 | INV-01 | {"1": "Joe", "3": true, "5": "2020-12-12"} | 1 |
| 2 | INV-01 | {"2":"Hello", "4": 1000} | 2 |
customization_fields:
| id | label | data_type | company_id |
| -- | ----- | --------- | ---------- |
| 1 | manager | text | 1 |
| 2 | reference | text | 2 |
| 3 | emailed | boolean | 1 |
| 4 | account | integer | 2 |
| 5 | due_date | date | 1 |
So I want to create a view for getting each company's invoices, something like:
CREATE OR REPLACE VIEW companies_invoices AS SELECT * FROM invoices WHERE company_id = 1
which should get a result like below:
| id | invoice_number | company_id | manager | emailed | due_date |
| -- | -------------- | ---------- | ------- | ------- | -------- |
| 1 | INV-01 | 1 | Joe | true | 2020-12-12 |
My challenge here is that I cannot know the keys when I write the query. If I knew them, I could write:
SELECT
id,
invoice_number,
company_id,
custom_fields:"1" AS manager,
custom_fields:"3" AS emailed,
custom_fields:"5" AS due_date
FROM invoices
WHERE company_id = 1
These keys and labels are stored in the customization_fields table; I tried different approaches but was not able to make it work.
So could anyone tell me whether this can be done? If it can, an example would really help.
You cannot do what you want to do with a view. A view has a fixed set of columns and they have specific types. Retrieving a dynamic set of columns requires some other mechanism.
If you're trying to change the number of columns or the names of the columns based on the rows in the customization_fields table, you can't do it in a view.
If you have a defined schema and just need to grab dynamic JSON properties, you may want to consider looking into Snowflake's GET function. It allows you to get any part of a JSON using a string for the path rather than using a literal path in the SQL statement. For example:
create temp table foo(v variant);
insert into foo select parse_json('{ "name":"John", "age":30, "car":null }');
-- This uses a literal path in the SQL to get to a JSON property
select v:name::string as first_name from foo;
-- This uses the GET function to get the value from a path in a string
select get(v, 'name')::string as first_name from foo;
You can replace the 'name' in the second parameter of the GET function with the value stored in the customization_fields table.
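For example, rather than dynamic columns, you could return one row per invoice and custom field by joining to customization_fields and passing each field's id to GET (a sketch; the ::string cast is illustrative and would need to respect data_type):
-- One row per (invoice, custom field); the field id doubles as the JSON key
select i.id,
       i.invoice_number,
       cf.label,
       get(i.custom_fields, to_varchar(cf.id))::string as value
from invoices i
join customization_fields cf
  on cf.company_id = i.company_id
where i.company_id = 1;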
In Snowflake, you will have to use a stored procedure to retrieve the dynamic set of columns.
I have a table "table1" like this:
+----+-------------+-----+
| id | barcode     | lot |
+----+-------------+-----+
| 0  | ABC-123-456 |     |
| 1  | ABC-123-654 |     |
| 2  | ABC-789-EFG |     |
| 3  | ABC-456-EFG |     |
+----+-------------+-----+
I have to extract the number in the middle of the "barcode" column, as with this query:
SELECT SUBSTR(barcode, 5, 3) AS ToExtract FROM table1;
The result:
+-----------+
| ToExtract |
+-----------+
| 123 |
| 123 |
| 789 |
| 456 |
+-----------+
And insert this into the column "lot" .
Follow along these lines:
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;
i.e., in your case:
UPDATE table_name
SET lot = SUBSTR(barcode, 5, 3)
WHERE condition; -- if any
UPDATE table1 SET Lot = SUBSTR(barcode, 5, 3)
-- WHERE ...;
Many databases support generated (aka "virtual" or "computed") columns. These let you define a column as an expression. The syntax is something like this:
alter table table1 add column lot varchar(3) generated always as (SUBSTR(barcode, 5, 3))
Using a generated column has several advantages:
It is always up-to-date.
It generally does not occupy any space.
There is no overhead when creating the table (although there is overhead when querying the table).
I should note that the syntax varies a bit among databases. Some don't require the type specification. Some use just as instead of generated always as.
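For instance (a sketch; pick the variant that matches your database):
-- MySQL 5.7+: a virtual generated column occupies no storage
alter table table1 add column lot varchar(3)
  generated always as (substr(barcode, 5, 3)) virtual;

-- PostgreSQL 12+: only stored generated columns are supported,
-- so this variant does occupy space
alter table table1 add column lot varchar(3)
  generated always as (substr(barcode, 5, 3)) stored;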
CREATE TABLE Table1(id INT,barcode varchar(255),lot varchar(255))
INSERT INTO Table1 VALUES (0,'ABC-123-456',NULL),(1,'ABC-123-654',NULL),(2,'ABC-789-EFG',NULL)
,(3,'ABC-456-EFG',NULL)
UPDATE a
SET a.lot = SUBSTRING(b.barcode, 5, 3)
FROM Table1 a
INNER JOIN Table1 b ON a.id=b.id
WHERE a.lot IS NULL
id | barcode | lot
-: | :---------- | :--
0 | ABC-123-456 | 123
1 | ABC-123-654 | 123
2 | ABC-789-EFG | 789
3 | ABC-456-EFG | 456
db<>fiddle here
I need to update/migrate a table IdsTable in my SQL Server database which has the following format:
+----+------------------+---------+
| id | ids | idType |
+----+------------------+---------+
| 1 | id11, id12, id13 | idType1 |
| 2 | id20 | idType2 |
+----+------------------+---------+
The ids column is a comma-separated list of ids. I need to combine the ids and idType columns to form a single JSON string for each row and update the ids column with that object.
The JSON object has the following format:
{
"idType": string,
"ids": string[]
}
Final table after transforming/migrating data should be:
+----+-----------------------------------------------------+---------+
| id | ids | idType |
+----+-----------------------------------------------------+---------+
| 1 | {"idType": "idType1","ids": ["id11","id12","id13"]} | idType1 |
| 2 | {"idType": "idType2","ids": ["id20"]} | idType2 |
+----+-----------------------------------------------------+---------+
The best I've figured out so far is to get the results into a format where I could GROUP BY id to try and get the correct JSON format:
SELECT X.id, Y.value, X.idType
FROM
IdsTable AS X
CROSS APPLY STRING_SPLIT(X.ids, ',') AS Y
Which gives me the results:
+----+------+---------+
| id | ids | idType |
+----+------+---------+
| 1 | id11 | idType1 |
| 1 | id12 | idType1 |
| 1 | id13 | idType1 |
| 2 | id20 | idType2 |
+----+------+---------+
But I'm not familiar enough with SQL Server JSON to move forward.
If it's a one-off op I think I'd just do it the dirty way:
UPDATE IdsTable SET ids =
CONCAT('{"idType": "', idType, '","ids": ["', REPLACE(ids, ', ', '","'), '"]}');
You might need to do some prep first, like if your ids column can look like:
id1,id2,id3
id4, id5, id6
id7 ,id8 , id9
etc., a series of replacements like:
UPDATE IdsTable SET ids = REPLACE(ids, ' ,', ',') WHERE ids LIKE '% ,%';
UPDATE IdsTable SET ids = REPLACE(ids, ', ', ',') WHERE ids LIKE '%, %';
Keep running those until they don't update any more records.
PS: if you've removed all the spaces from around the commas, you'll need to tweak the REPLACE in the original query, since I specified ', ' as the needle.
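With bare commas, the first statement would become, for example:
-- Same approach, assuming the separators are now plain commas
UPDATE IdsTable SET ids =
CONCAT('{"idType": "', idType, '","ids": ["', REPLACE(ids, ',', '","'), '"]}');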
I found this blog post that helped me construct my answer:
-- Create Temporary Table
SELECT
[TAB].[id], [TAB].[ids],
(
SELECT [STRING_SPLIT_RESULTS].value as [ids], [TAB].[idType] as [idType]
FROM [IdsTable] AS [REQ]
CROSS APPLY STRING_SPLIT([REQ].[ids],',') AS [STRING_SPLIT_RESULTS]
WHERE [REQ].[id] = [TAB].[id]
FOR JSON PATH
) as [newIds]
INTO [#TEMP_RESULTS]
FROM [IdsTable] AS [TAB]
-- Update rows
UPDATE [IdsTable]
SET [ids] = [#TEMP_RESULTS].[newIds]
FROM [#TEMP_RESULTS]
WHERE [IdsTable].[Id] = [#TEMP_RESULTS].[Id]
-- Delete Temporary Table
DROP TABLE [#TEMP_RESULTS]
Which produces the new values for the ids column (left unreplaced below for comparison):
+----+----------------+---------+------------------------------------------------------------------------------------------------------+
| id | ids | idType | newIds |
+----+----------------+---------+------------------------------------------------------------------------------------------------------+
| 1 | id11,id12,id13 | idType1 | [{"ids":"id11","idType":"idType1"},{"ids":"id12","idType":"idType1"},{"ids":"id13","idType":"idType1"}] |
| 2 | id20 | idType2 | [{"ids":"id20","idType":"idType2"}] |
+----+----------------+---------+------------------------------------------------------------------------------------------------------+
This is more verbose than I wanted, but considering the table size and the number of ids stored in the ids column (which translates to the size of the JSON object), this is fine for me.
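As a follow-up, if the exact target shape from the question is needed, a more compact sketch is possible (assuming SQL Server 2017+ for STRING_SPLIT, STRING_AGG, and TRIM; this is not the approach above, and it assumes the individual ids need no JSON escaping):
-- Build {"idType": "...", "ids": ["...", ...]} per row in one pass
UPDATE t
SET ids = (
    SELECT t.idType AS idType,
           -- JSON_QUERY embeds the array as real JSON rather than an escaped string
           JSON_QUERY('[' +
               (SELECT STRING_AGG('"' + TRIM(s.value) + '"', ',')
                FROM STRING_SPLIT(t.ids, ',') AS s) + ']') AS ids
    FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
)
FROM IdsTable AS t;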