Assume that we have a table where we have one field for zip code and the rest are binary fields (1 or NULL) with names corresponding to various places. For example, imagining the table has 201 fields with the first field titled "zip code" containing zip codes and the latter being 200 binary value fields titled with city names: Chicago, New York, Houston, etc.
Assume that row one contains zip code 11373. While one could use coalesce to find the first non-null value and return "New York" another value like "Elmhurst" may also be true.
zip_code new_york chicago elmhurst dover maspeth
10001 1 NULL NULL NULL NULL
07801 NULL NULL NULL 1 NULL
11373 1 NULL 1 NULL 1
The goal is to map the column names to each respective zip code and get an output like so:
zip_code city
10001 new_york
07801 dover
11373 new_york
11373 elmhurst
11373 maspeth
Any help is much appreciated.
This is a great use case for SQL UNPIVOT:
SELECT unpvt.*
FROM
#x UNPIVOT (v FOR statename IN (new_york, chicago,elmhurst, dover, maspeth)) AS unpvt
One method uses union all:
select zip_code, 'New York' as city from t where new_york = 1
union all
select zip_code, 'Chicago' as city from t where chicago = 1
union all
. . .
Related
I have trouble understanding row flags. The below question can clear it for me:
Is it possible to store a name and its flag in the same cell in SQL?
Consider:
If you have a table known as cars with the columns number_plate, colour, and brand_name. The brand_name has a name and a flag.
How would one store that in a single column? If it is not possible or advised, explain why and how to do it.
How would you then get the number of cars from a given country (based on the unique number_plate(primary key)) and the country flag?
I think you are trying to design a schema but haven't quite got the hang of foreign keys.
In your example, you'd have the following tables:
country:
country_id name continent
-----------------------------------
1 Germany Europe
2 Japan Asia
3 USA N.America
Brand
Brand_id name country_id (foreign key)
---------------------------------------------
1 Mercedes 1
2 Toyota 2
3 BMW 1
4 Chrysler 3
Car
Number_plate colour brand_id
------------------------------------------
xxx-yy-zz Green 1
aa-bb-cc Red 1
kkk-l-mmm Orange 2
....
To find the number of cars, based on the country where the brand is based, you'd do something like:
select country.name,
count(*)
from car
inner join brand on car.brand_id = brand.brand_id
inner join country on brand.country_id = country.country_id
group by country.name
Let's say name and flag are two separate columns. Using concat function they can be stored into a single column named brand_name.
select number_plate, colour, concat(name,' ',flag) as brand_name from cars
To get the count of cars(unique) based on a flag
select * from
(select
distinct number_plate,
colour,
concat(name,' ',flag) as brand_name from cars
) a
where brand_name like '%UK%'
Demo
I need to create a report(SSRS) ,but I can't figure out how i can get one specific value.
Please consider the following scenario: I Have a Hierarchy Country ->City->Person ,& i need to get the value of the wealth in one column for a specific person ,a specific city & a specific country.
the wealth of a counry is not the SUM of the wealth of the persons living in the country .
My report should look like :
USA : 20M$
NY: 3M$
Person1: 0,1M$
Person2: 0,2M$
Boston: 2M$
Person 3: 0,5M$
................
My query is :
;WITH wealth_TEMP AS
(
SELECT
fo.wealthId,
SUM(fo.wealth) AS wealth
FROM
(
SELECT
wealthId,
CASE
WHEN (Some Criteria)
THEN 1
ELSE 0
END AS wealth
FROM wealthTable f
) fo
GROUP BY wealthId
),
Hierarchy AS
(
SELECT
B.CityId AS CityId,
B.CountryName AS CountryName
FROM HierarchyTable AS B where
b.CountryName IN (#Country)
),
Person_Table AS
(
SELECT
fsu.individualId,
fsu.Cityid,
fo.welth
FROM Individual_Table FSU
LEFT OUTER JOIN wealth_TEMP fo ON fo.wealthId = fsu.individualId
WHERE bu.Cityname IN (#City)
)
Select ......................
Considering that your dataset returns something like it:
Country CountryWealth State StateWealth Person PersonWealth
USA 20M$ NY 3M$ Person 1 0,1M$
USA 20M$ NY 3M$ Person 2 0,2M$
USA 20M$ Boston 2M$ Person 3 0,5M$
Create a table with 5 columns and remove the header line
Insert tow new details line
In the first line and first column configure to the Country and Hide Duplicates to true
In the first line and second column configure to the CountryWealth and Hide Duplicates to true
In the second line and second column configure to the State and Hide Duplicates to true
In the second line and third column configure to the StateWealth and Hide Duplicates to true
In the third line and third to the end configure the others values
Order the table to Country, State, person
May, you need to add more columns and merge to get your desired lyout
Hello i am new to teradata. I am loading flat file into my TD DB using fast load.
My data set(CSV FILE) contains some issues like some of the rows in city column contains proper data but some of the rows contains NULL. The values of the city columns which contains NULL are stored into the next column which is zip code and so on. At the end some of the rows contains extra columns due to the extra NULL in rows. Examples is given below. How to resolve these kind of issues in fastload? Can someone answer this with SQL example?
City Zipcode country
xyz 12 Esp
abc 11 Ger
Null def(city's data) 12(zipcode's data) Por(country's data)
What about different approach. Instead of solving this in fast load, load your data to temporary table like DATABASENAME.CITIES_TMP with structure like below
City | zip_code | country | column4
xyz | 12 | Esp |
NULL | abc | 12 | Por
In next step create target table DATABASENAME.CITY with the structure
City | zip_code | country |
As a final step you need to run 2 INSERT queries:
INSERT INTO DATABASENAME.CITY (City, zip_code, country)
SELECT City, zip_code, country FROM DATABASENAME.CITIES_TMP
WHERE CITY not like 'NULL'/* WHERE CITY is not null - depends if null is a text value or just empty cell*/;
INSERT INTO DATABASENAME.CITY (City, zip_code, country)
SELECT Zip_code, country, column4 FROM DATABASENAME.CITIES_TMP
WHERE CITY like 'NULL' /* WHERE CITY is null - depends if null is a text value or just empty cell*/
Of course this will work if all your data looks exacly like in sample you provide.
This also will work only when you need to do this once in a while. If you need to load data few times a day it will be a litte cumbersome (not sure if I used proper word in this context) and then you should build some kind of ETL process with for example Talend tool.
I have a tricky flat file data source. The data is grouped, like this:
Country City
U.S. New York
Washington
Baltimore
Canada Toronto
Vancouver
But I want it to be this format when it's loaded in to the database:
Country City
U.S. New York
U.S. Washington
U.S. Baltimore
Canada Toronto
Canada Vancouver
Anyone has met such a problem before? Got a idea to deal with it?
The only idea I got now is to use the cursor, but the it is just too slow.
Thank you!
The answer by cha will work, but here is another in case you need to do it in SSIS without temporary/staging tables:
You can run your dataflow through a Script Transformation that uses a DataFlow-level variable. As each row comes in the script checks the value of the Country column.
If it has a non-blank value, then populate the variable with that value, and pass it along in the dataflow.
If Country has a blank value, then overwrite it with the value of the variable, which will be last non-blank Country value you got.
EDIT: I looked up your error message and learned something new about Script Components (the Data Flow tool, as opposed to Script Tasks, the Control Flow tool):
The collection of ReadWriteVariables is only available in the
PostExecute method to maximize performance and minimize the risk of
locking conflicts. Therefore you cannot directly increment the value
of a package variable as you process each row of data. Increment the
value of a local variable instead, and set the value of the package
variable to the value of the local variable in the PostExecute method
after all data has been processed. You can also use the
VariableDispenser property to work around this limitation, as
described later in this topic. However, writing directly to a package
variable as each row is processed will negatively impact performance
and increase the risk of locking conflicts.
That comes from this MSDN article, which also has more information about the Variable Dispenser work-around, if you want to go that route, but apparently I mislead you above when I said you can set the value of the package variable in the script. You have to use a variable that is local to the script, and then change it in the Post-Execute event handler. I can't tell from the article whether that means that you will not be able to read the variable in the script, and if that's the case, then the Variable Dispenser would be the only option. Or I suppose you could create another variable that the script will have read-only access to, and set its value to an expression so that it always has the value of the read-write variable. That might work.
Yes, it is possible. First you need to load the data to a table with an IDENTITY column:
-- drop table #t
CREATE TABLE #t (id INTEGER IDENTITY PRIMARY KEY,
Country VARCHAR(20),
City VARCHAR(20))
INSERT INTO #t(Country, City)
SELECT a.Country, a.City
FROM OPENROWSET( BULK 'c:\import.txt',
FORMATFILE = 'c:\format.fmt',
FIRSTROW = 2) AS a;
select * from #t
The result will be:
id Country City
----------- -------------------- --------------------
1 U.S. New York
2 Washington
3 Baltimore
4 Canada Toronto
5 Vancouver
And now with a bit of recursive CTE magic you can populate the missing details:
;WITH a as(
SELECT Country
,City
,ID
FROM #t WHERE ID = 1
UNION ALL
SELECT COALESCE(NULLIF(LTrim(#t.Country), ''),a.Country)
,#t.City
,#t.ID
FROM a INNER JOIN #t ON a.ID+1 = #t.ID
)
SELECT * FROM a
OPTION (MAXRECURSION 0)
Result:
Country City ID
-------------------- -------------------- -----------
U.S. New York 1
U.S. Washington 2
U.S. Baltimore 3
Canada Toronto 4
Canada Vancouver 5
Update:
As Tab Alleman suggested below the same result can be achieved without the recursive query:
SELECT ID
, COALESCE(NULLIF(LTrim(a.Country), ''), (SELECT TOP 1 Country FROM #t t WHERE t.ID < a.ID AND LTrim(t.Country) <> '' ORDER BY t.ID DESC))
, City
FROM #t a
BTW, the format file for your input data is this (if you want to try the scripts save the input data as c:\import.txt and the format file below as c:\format.fmt):
9.0
2
1 SQLCHAR 0 11 "" 1 Country SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 0 100 "\r\n" 2 City SQL_Latin1_General_CP1_CI_AS
I want to do some statistic for the Point in my appliation,this is the columns for Point table:
id type city
1 food NewYork
2 food Washington
3 sport NewYork
4 food .....
Each point belongs to a certain type and located at the certain city.
Now I want to caculate the numbers of points in different city for each type.
For example, there are two types here :food and sport.
Then I want to know:
how many points of `food` and `sport` at NewYork
how many points of `food` and `sport` at Washington
how many points of `food` and `sport` at Chicago
......
I have tried this:
select type,count(*) as num from point group by type ;
But I can not group the by the city.
How to make it?
Update
id type city
1 food NewYork
2 sport NewYork
3 food Chicago
4 food San
And I want to get something like this:
NewYork Chicago San
food 2 1 1
sport 1 0 0
I will use the html table and chart to display these datas.
So I need to do the counting, I can use something like this:
select count(*) from point where type='food' and city ='San'
select count(*) from point where type='food' and city ='NewYork'
....
However I think this is a bad idea,so I wonder if I can use the sql to do the counting.
BTW,for these table data,how do people organization their structure using json?
this's what you want:
SELECT city,
COUNT(CASE WHEN [type] = 'food' THEN 1 END) AS FoodCount,
COUNT(CASE WHEN [type] = 'sport' THEN 1 END) AS SportCount
FROM point
GROUP BY city
UPDATE:
To get the results in an aggregated row/column format you need to use a pivot table. In Access it's called a Crosstab query. You can use the Crosstab query wizard to generate the query via a nice UI or cut straight to the SQL:
TRANSFORM COUNT(id) AS CountOfId
SELECT type
FROM point
GROUP BY type
PIVOT city
The grouping is used to count the number of Id's for each type. The additional PIVOT clause groups the data by city and displays each grouping in a separate column. The end result looks something like this:
NewYork Chicago San
food 2 1 1
sport 1 0 0