I have data in a table in SQL Server, as follows:
Number  Value
1       /F10749180509 1/TOYOTA TSUSHO ASIA PACIFIC PTE. L 1/TD. 2/600 NORTH BRIDGE ROAD HEX19 01, P 3/SG/ARKVIEWSQUARE SINGAPORE 188778
2       /0019695051 1/PT ASURANSI ALLIANZ LIFE 1/INDONESIA 2/ALLIANZ TWR 16FL JL.HRRASUNA SAID 3/ID/JAKARTA
As you can see in the table, I need to find the country code in the Value field. The country code appears in the string after "3/". In the first row I need to get "SG" after 3/, in the second row I need to get "ID" after 3/, and so on. If I copy the first value from the Value field into Notepad, the data is separated by new lines. It looks like this:
/F10749180509
1/TOYOTA TSUSHO ASIA PACIFIC PTE. L
1/TD.
2/600 NORTH BRIDGE ROAD HEX19 01, P
3/SG/ARKVIEWSQUARE SINGAPORE 188778
Please help me find a query to get the country code. Thank you.
We might be able to use PATINDEX here along with SUBSTRING. Assuming that the country code would always be exactly two uppercase letters, we can try:
SELECT val, SUBSTRING(val, PATINDEX('% [0-9]/[A-Z][A-Z]/%', val) + 3, 2) AS country_code
FROM yourTable
You can use CHARINDEX to get the data.
declare @table table(number int, val varchar(8000))
insert into @table
values
(1, '/F10749180509 1/TOYOTA TSUSHO ASIA PACIFIC PTE. L 1/TD. 2/600 NORTH BRIDGE ROAD HEX19 01, P 3/SG/ARKVIEWSQUARE SINGAPORE 188778')

select substring(val, charindex('3/', val, 1) + 2, 2) from @table
SG
I would appreciate a little help with some SQL. I have a list like the one below and a database table, Table1, with Statement as a column name. I would like to create a column called location, where the script searches the Statement column and, once it finds any of the items from the list in a row, puts that item in the location column:
(Tema, london, Sydney, Germany, China, Africa)
Statement
-------------------
Going to london
Apples in Tema
Sydney is a city
China is a country
Africa is a continent
In the end I hope to see a table like this:
Statement              location
---------------------  --------
Going to london        London
Apples in Tema         Tema
Sydney is a city       Sydney
china is a country     China
Africa is a continent  Africa
By using this script,
SELECT Statement,
Case
WHEN Statement::text ~~* '%london%'::character varying::text
THEN 'london'::character varying
ELSE NULL::character varying
END AS location
FROM Table1
I think I would have to write a very long script, but I was wondering if I could get help with something efficient and quite simple to achieve this.
If you have a list of places, you can use that:
select t1.*, v.place
from table1 t1 join
     (values ('tema'), ('london'), ('sydney'), ('germany'), ('china'), ('africa')
     ) v(place)
     on t1.Statement ilike '%' || v.place || '%';
Note: You might want to use regular expressions so you can include word boundaries, but your example code doesn't do this.
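For example, here is a minimal sketch using Postgres regex word boundaries (\m marks the start of a word, \M the end), assuming the same table1 and Statement column as above:
select t1.*, v.place
from table1 t1 join
     (values ('tema'), ('london'), ('sydney'), ('germany'), ('china'), ('africa')
     ) v(place)
     on t1.Statement ~* ('\m' || v.place || '\M');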
I hope you can help me with this. I've used pseudocode to keep everything simple.
I have a table which describes locations.
location_table
location = charfield(200) # New York, London, Tokyo
A product manager now wants locations to be as follows:
Global = select every location
Asia = select every location in Asia
US = select every location in US
Current system = London (etc.)
This is my proposed redesign.
location_table
location = charfield(200) # New York, London, Tokyo
continent = foreign key to continent_table
continent_table
continent = charfield(50) # "None", "Global", Asia, Europe
But this seems horrible. It means in my code I'll always need to check if the customer is using "global" or "none", and then select the corresponding location records. For example, there will be code like this scattered everywhere:
get continent
if continent is global, select everything from location_table
else if continent is none, select location from location_table
else select location from location_table where foreign key is continent
My feeling is this is a known problem, and there is a known solution for it. Any ideas?
Thank you.
What you seem to have here is a set of locations, and then a set of location groups. Those groups might be all of the locations (global), or a subset of them.
You can build this with an intermediate table between the locations and a new location sets table which associates locations and location sets.
You might build the location set table and the join table so that the individual locations are also location sets, but ones which join only to one location. That way all location selections come from one table -- the location sets.
So you end up with three different types of location set:
Ones which map 1:1 with a location
One which maps 1:all ("global")
Ones which map 1:many (continents and other areas)
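A minimal sketch of that design, with hypothetical table and column names:
CREATE TABLE location (
    location_id  INT PRIMARY KEY,
    name         VARCHAR(200) NOT NULL            -- New York, London, Tokyo, ...
);
CREATE TABLE location_set (
    location_set_id  INT PRIMARY KEY,
    name             VARCHAR(200) NOT NULL        -- 'Global', 'Asia', or a single city
);
CREATE TABLE location_set_member (
    location_set_id  INT NOT NULL REFERENCES location_set,
    location_id      INT NOT NULL REFERENCES location,
    PRIMARY KEY (location_set_id, location_id)
);
-- Whatever the customer picked (a single city, a continent, or 'Global'),
-- the selection is always the same query:
SELECT l.*
FROM location AS l
JOIN location_set_member AS m ON m.location_id = l.location_id
WHERE m.location_set_id = 1;                      -- the set the customer chose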
It's conceivable that this could be created as a hierarchy, but those queries can be inefficient because the join cardinalities tend to be obscured from the optimiser.
You could do this using a hierarchy and a self-referencing foreign key, e.g.
LocationID  Name            ParentLocationID  LocationType
-----------------------------------------------------------
1           Planet Earth    NULL              Planet
2           Africa          1                 Continent
3           Antarctica      1                 Continent
4           Asia            1                 Continent
5           Australasia     1                 Continent
6           Europe          1                 Continent
7           North America   1                 Continent
8           South America   1                 Continent
9           United States   7                 Country
10          Canada          7                 Country
11          Mexico          7                 Country
12          California      9                 State
13          San Diego       12                City
14          England         6                 Country
15          Cornwall        14                County
16          Truro           15                City
Hierarchical data usually requires either recursion or multiple joins to get all levels; there are articles comparing the performance of each approach on the major DBMSs.
Many DBMSs now support recursive common table expressions, and since no DBMS is specified I will use SQL Server syntax, because it is what I am most comfortable with. A quick example would be:
DECLARE @LocationID INT = 7; -- NORTH AMERICA

WITH LocationCTE AS
(   SELECT  l.LocationID, l.Name, l.ParentLocationID, l.LocationType
    FROM    dbo.Location AS l
    WHERE   l.LocationID = @LocationID
    UNION ALL
    SELECT  l.LocationID, l.Name, l.ParentLocationID, l.LocationType
    FROM    dbo.Location AS l
            INNER JOIN LocationCTE AS c
                ON c.LocationID = l.ParentLocationID
)
SELECT  *
FROM    LocationCTE;
Output based on above sample data
LocationID  Name            ParentLocationID  LocationType
-----------------------------------------------------------
7           North America   1                 Continent
9           United States   7                 Country
10          Canada          7                 Country
11          Mexico          7                 Country
12          California      9                 State
13          San Diego       12                City
Supplying a value of 1 (Planet Earth) for the location ID will return the full table, while supplying a LocationID of 11 (Mexico) would return only that one row, because there is nothing below it in the sample data.
I'll go with your answer and say that I don't find it all that horrible to check each time whether a customer is searching by city, by continent, or by nothing at all. That would be the role of the backend code, and it would always lead to different queries depending on which option they choose.
But I would remove "None" and "Global" from the continent table, and just use different queries when one of those options is chosen. You would end up with the 3 possible SQL queries you have, and I don't find that to be bad design per se. Maybe other solutions are more performant, but this one seems more readable and logical. It's just optional querying with joined tables.
The other answers trade performance/duplication for readability (which isn't a bad thing, depending on how many times you rely on this condition in your application, how many queries you use it in, and how many cities you have).
For readability and non-repetition, the best thing would be to concentrate these conditions in one SQL function which takes a string parameter and returns all locations depending on the input (but at the cost of performance).
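For example, here is a minimal sketch of such a function in SQL Server syntax; the function name and the key columns in the join are hypothetical, so adjust them to your schema:
CREATE FUNCTION dbo.GetLocations (@filter VARCHAR(200))
RETURNS TABLE
AS
RETURN
(
    SELECT l.location
    FROM location_table AS l
    LEFT JOIN continent_table AS c
        ON c.continent_id = l.continent_id        -- assumed foreign key columns
    WHERE @filter IN ('Global', 'None')           -- every location
       OR c.continent = @filter                   -- only locations on that continent
);
-- Usage:
-- SELECT location FROM dbo.GetLocations('Asia');
-- SELECT location FROM dbo.GetLocations('Global');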
Use levels:
0 -> None
00 -> Global
001 -> Europe
002 -> Asia
003 -> Africa
select location from location_table where continent like '[value]%'
Using a fixed length code, you can prefix regions, and then add one more digit for a region inside a region, and so on.
Ok, let me try to improve it.
Consider the world; it has the minimum level (or maximum, depending on how you see it).
World ID = '0' (1 digit)
Now, select how you want to divide the world: (Continents, Half-Continents, ...) and assign the next level.
Europe ID = '01' (First digit World + Second digit Europe)
Asia ID = '02'
America ID = '03'
...
Next Level: Countries. (At least 2 digits)
England ID = '0101' (World + Continent + Country)
Deutschland ID = '0102'
....
Texas ID = '0301'
....
Next Level: Regions (2 digits)
Yorkshire ID = '010101' (World + Continent + Country + Region)
....
Next Level: Cities (2 or 3 digits)
London ID = '01010101' (World + Continent + Country + Region + City)
And so on.
Now, the same SELECT some_aggregate, statistics, ... FROM ... can be used for no matter what region, simply change:
WHERE Region like '0%' --> The whole world
WHERE Region like '02%' --> Asia
WHERE Region like '01010101%' --> London
WHERE Region like '02%' OR Region like '01%' --> Asia & Europe
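A minimal sketch of this fixed-length code idea (a materialized path), with hypothetical column names; an index on the code column can keep the LIKE 'prefix%' filters cheap:
CREATE TABLE location_table (
    region_code  VARCHAR(12) PRIMARY KEY,         -- e.g. '0', '01', '0101', '010101', ...
    location     VARCHAR(200) NOT NULL
);
-- All of Europe:
SELECT location FROM location_table WHERE region_code LIKE '01%';
-- Asia and Europe:
SELECT location FROM location_table WHERE region_code LIKE '02%' OR region_code LIKE '01%';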
I have a tricky flat file data source. The data is grouped, like this:
Country   City
U.S.      New York
          Washington
          Baltimore
Canada    Toronto
          Vancouver
But I want it to be in this format when it's loaded into the database:
Country   City
U.S.      New York
U.S.      Washington
U.S.      Baltimore
Canada    Toronto
Canada    Vancouver
Has anyone met such a problem before? Any idea how to deal with it?
The only idea I have now is to use a cursor, but it is just too slow.
Thank you!
The answer by cha will work, but here is another in case you need to do it in SSIS without temporary/staging tables:
You can run your dataflow through a Script Transformation that uses a DataFlow-level variable. As each row comes in, the script checks the value of the Country column.
If it has a non-blank value, then populate the variable with that value, and pass it along in the dataflow.
If Country has a blank value, then overwrite it with the value of the variable, which will be last non-blank Country value you got.
EDIT: I looked up your error message and learned something new about Script Components (the Data Flow tool, as opposed to Script Tasks, the Control Flow tool):
The collection of ReadWriteVariables is only available in the PostExecute method to maximize performance and minimize the risk of locking conflicts. Therefore you cannot directly increment the value of a package variable as you process each row of data. Increment the value of a local variable instead, and set the value of the package variable to the value of the local variable in the PostExecute method after all data has been processed. You can also use the VariableDispenser property to work around this limitation, as described later in this topic. However, writing directly to a package variable as each row is processed will negatively impact performance and increase the risk of locking conflicts.
That comes from this MSDN article, which also has more information about the Variable Dispenser work-around, if you want to go that route, but apparently I misled you above when I said you can set the value of the package variable in the script. You have to use a variable that is local to the script, and then change it in the Post-Execute event handler. I can't tell from the article whether that means that you will not be able to read the variable in the script, and if that's the case, then the Variable Dispenser would be the only option. Or I suppose you could create another variable that the script will have read-only access to, and set its value to an expression so that it always has the value of the read-write variable. That might work.
Yes, it is possible. First you need to load the data into a table with an IDENTITY column:
-- drop table #t
CREATE TABLE #t (id INTEGER IDENTITY PRIMARY KEY,
Country VARCHAR(20),
City VARCHAR(20))
INSERT INTO #t(Country, City)
SELECT a.Country, a.City
FROM OPENROWSET( BULK 'c:\import.txt',
FORMATFILE = 'c:\format.fmt',
FIRSTROW = 2) AS a;
select * from #t
The result will be:
id          Country              City
----------- -------------------- --------------------
1           U.S.                 New York
2                                Washington
3                                Baltimore
4           Canada               Toronto
5                                Vancouver
And now with a bit of recursive CTE magic you can populate the missing details:
;WITH a as(
SELECT Country
,City
,ID
FROM #t WHERE ID = 1
UNION ALL
SELECT COALESCE(NULLIF(LTrim(#t.Country), ''),a.Country)
,#t.City
,#t.ID
FROM a INNER JOIN #t ON a.ID+1 = #t.ID
)
SELECT * FROM a
OPTION (MAXRECURSION 0)
Result:
Country              City                 ID
-------------------- -------------------- -----------
U.S.                 New York             1
U.S.                 Washington           2
U.S.                 Baltimore            3
Canada               Toronto              4
Canada               Vancouver            5
Update:
As Tab Alleman suggested below, the same result can be achieved without the recursive query:
SELECT ID
, COALESCE(NULLIF(LTrim(a.Country), ''), (SELECT TOP 1 Country FROM #t t WHERE t.ID < a.ID AND LTrim(t.Country) <> '' ORDER BY t.ID DESC))
, City
FROM #t a
BTW, the format file for your input data is this (if you want to try the scripts, save the input data as c:\import.txt and the format file below as c:\format.fmt):
9.0
2
1 SQLCHAR 0 11 "" 1 Country SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 0 100 "\r\n" 2 City SQL_Latin1_General_CP1_CI_AS
In the table, the data is like this:
st_no st_name directions others
1210 6th street Northwest
1210 8th ST NW NULL
I need output like (1210,1210) (6th street, 8th st) Northwest.
Any help you can provide is greatly appreciated. And, in case it comes into play, I'm using a Postgresql DB.
Try like this:
select string_agg(st_no::text, ',') as st_no,   -- cast in case st_no is numeric
       string_agg(st_name, ',') as st_name,
       string_agg(directions, ',') as directions,
       string_agg(others, ',') as others
from table1
group by st_no
For example, I have these 2 tables:
dbo.fc_states
StateId Name
6316 Alberta
6317 British Columbia
and dbo.fc_Query
Name            StatesName        StateId
Abbotsford      Quebec            NULL
Abee            Alberta           NULL
100 Mile House  British Columbia  NULL
OK, pretty straightforward: how do I copy the StateId over from fc_states to fc_Query, matching it on the StatesName? Let's say the result would be:
Name            StatesName        StateId
Abee            Alberta           6316
100 Mile House  British Columbia  6317
Thanks. Both state name columns are of type text.
How about:
update fc_Query set StateId =
(select StateId from fc_states where fc_states.Name = fc_Query.StatesName)
That should give you the result you're looking for.
This is a different way from what Eddie did. I like MERGE for updates if they're not dead simple (and I wouldn't consider yours dead simple). So if you're bored/curious, also try:
WITH stateIds as
    (SELECT name, MAX(stateID) as stID
     FROM fc_states
     GROUP BY name)
MERGE fc_Query
USING stateIds
    ON stateIds.name = fc_Query.StatesName
WHEN MATCHED THEN UPDATE
    SET fc_Query.StateId = convert(int, stID)
;
The first part, from "WITH" to the closing ")" after GROUP BY name, is a CTE that creates a table-like thing - a name, 'stateIds', that works as a table for the immediately following part of the query - where there's guaranteed to be only one row per state name. Then the MERGE looks for anything in fc_Query with a matching name, and if there's a match, it sets the StateId as you want. You can make a small edit if you don't want to overwrite existing StateIds in fc_Query:
WITH stateIds as
    (SELECT name, MAX(stateID) as stID
     FROM fc_states
     GROUP BY name)
MERGE fc_Query
USING stateIds
    ON stateIds.name = fc_Query.StatesName
WHEN MATCHED AND fc_Query.StateId IS NULL THEN UPDATE
    SET fc_Query.StateId = convert(int, stID)
;
;
And you can have it do something different to rows that don't match. So I think MERGE is good for a lot of applications. You need a semicolon at the end of MERGE statements, and you have to guarantee that there will only be one match or zero matches in the source (that is "stateids", my CTE) for each row in the target; if there's more than one match some horrible thing happens, Satan wins or the US economy falters, I'm not sure what, just never let it happen.