Thank you in advance! I'm a novice, so plain and simple explanations would be greatly appreciated. I really don't know what I'm doing, but I have grand concepts without any idea how to execute them.
I am creating an append query in Access 2010 and I want to set Col_C to yes if the word "red" appears in Col_B, and to no for everything else.
Desired Result
Col A | Col B | Col C
1 | red | y
2 | blue | n
3 | red | y
4 | green | n
5 | red blue | y
I don't know how to write the if then statement.
INSERT into TABLE1
SELECT [TABLE2].[Col_A], [TABLE2].[CoL_B]
IF [TABLE2].[Col_B] like "*red*" then add y to Col_C else n
FROM [TABLE2]
I know it's a bad attempt, but I've been researching and the explanations aren't clear enough for me to proceed.
It could be:
INSERT INTO
TABLE1
([Col_A], [CoL_B], [Col_C])
SELECT
[TABLE2].[Col_A],
[TABLE2].[CoL_B],
IIF([TABLE2].[Col_B] Like "*red*", "y", "n")
FROM
[TABLE2]
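To check the logic outside Access, here is a minimal sketch that runs the same kind of append against an in-memory SQLite database from Python. SQLite and Python are assumptions made purely for illustration: Access would use IIF with the * wildcard, while SQLite uses CASE with the % wildcard. Table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE2 (Col_A INTEGER, Col_B TEXT);
    CREATE TABLE TABLE1 (Col_A INTEGER, Col_B TEXT, Col_C TEXT);
    INSERT INTO TABLE2 VALUES
        (1, 'red'), (2, 'blue'), (3, 'red'), (4, 'green'), (5, 'red blue');
""")

# Append every row, setting Col_C to 'y' when Col_B contains "red" anywhere.
conn.execute("""
    INSERT INTO TABLE1 (Col_A, Col_B, Col_C)
    SELECT Col_A, Col_B,
           CASE WHEN Col_B LIKE '%red%' THEN 'y' ELSE 'n' END
    FROM TABLE2
""")

rows = conn.execute(
    "SELECT Col_A, Col_B, Col_C FROM TABLE1 ORDER BY Col_A"
).fetchall()
for row in rows:
    print(row)
```

A substring test is what makes row 5 ("red blue") come out as y; an exact comparison with "red" would mark it n.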
I am new to SQL and working on a database that needs a binary indicator based on the presence of string values in a column. I'm trying to make a new table as follows:
Original:
Indicator
a, b, c
c, d, e
Desired:
Indicator | type
a, b, c | 1
c, d, e | 0
SQL code:
SELECT
ID,
Contract,
Indicator,
CASE
WHEN Indicator IN ('a', 'b')
THEN 1
ELSE 0
END as Type
INTO new_table
FROM old_table
The table I keep creating reports every type as 0.
I also have 200+ distinct indicators, so it will be really time-consuming to write each as:
CASE
WHEN Indicator = 'a' THEN '1'
WHEN Indicator = 'b' THEN '1'
Is there a more streamlined way to think about this?
Thanks!
I think the first step is to understand why your code doesn't work right now.
If the values in your Indicator column are literally the strings you noted ("a, b, c" in one row and "c, d, e" in another), then your CASE statement is saying "I am looking for an exact match of the full value of Indicator against the following list":
The letter a, or
The letter b
Essentially, you are asking "hey SQL, does 'a, b, c' match 'a'? Or does 'a, b, c' match 'b'?"
Obviously SQL's answer is "these don't match", which is why you get all 0s.
You can try wildcard matching with the LIKE syntax.
CASE WHEN Indicator LIKE '%a%' OR Indicator LIKE '%b%' THEN 1 ELSE 0 END AS Type
Now, if the abc and cde strings aren’t REALLY what’s in your database then this approach may not work well for you.
For example, let's say your real values are words that are all slapped together in a single string.
Let’s say that your strings are 3 words each.
Cat, Dog, Man
Catalog, Stick, Shoe
Hair, Hellcat, Belt
And let’s say that Cat is a value that should cause Type to be 1.
If you write CASE WHEN Indicator LIKE '%cat%' THEN 1 ELSE 0 END AS Type, all 3 rows will get a 1, because the wildcard will match Cat in Catalog and cat in Hellcat.
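The over-matching described above is easy to reproduce. The sketch below uses an in-memory SQLite database from Python (an assumption for illustration; SQLite's LIKE is case-insensitive for ASCII, which may differ from your database). The whole_word column is not part of the original answer: it shows one common workaround, wrapping the value and the pattern in the delimiter, and it assumes the values are always separated by a comma and a space.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE old_table (Indicator TEXT)")
conn.executemany(
    "INSERT INTO old_table VALUES (?)",
    [("Cat, Dog, Man",), ("Catalog, Stick, Shoe",), ("Hair, Hellcat, Belt",)],
)

rows = conn.execute("""
    SELECT Indicator,
           -- naive substring match: also hits Catalog and Hellcat
           CASE WHEN Indicator LIKE '%cat%' THEN 1 ELSE 0 END AS naive,
           -- delimiter-wrapped match: only hits the standalone word
           CASE WHEN ', ' || Indicator || ', ' LIKE '%, cat, %'
                THEN 1 ELSE 0 END AS whole_word
    FROM old_table
    ORDER BY rowid
""").fetchall()

for row in rows:
    print(row)
```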
I think the bottom line is that unless your Indicator values really are 3 letters and your match criteria is a single letter, you very well could be better off writing a 200 line long case statement if you need this done any time soon.
A better approach to consider (depending on things like whether you are going to have 300 different combinations a week, a month, or a year from now):
Wouldn't it be nice if you had a table with a total of 6 rows, like so?
Indicator | Indicator_Parsed
a,b,c | a
a,b,c | b
a,b,c | c
c,d,e | c
c,d,e | d
c,d,e | e
Then you could write the query as you have it, CASE WHEN Indicator_Parsed IN ('a', 'b') THEN 1 ELSE 0 END AS Type, as a piece of a more verbose solution.
If this approach seems useful to you, the page "Turning a Comma Separated string into individual rows" shows how to parse comma-separated values into additional rows.
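As a rough sketch of the parsed-table idea, the example below splits each Indicator value in Python and stores one row per element, then applies the original IN test per element. Doing the split in Python and the table name parsed are assumptions for illustration; the linked page covers doing the split inside the database itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE old_table (Indicator TEXT)")
conn.executemany(
    "INSERT INTO old_table VALUES (?)", [("a, b, c",), ("c, d, e",)]
)

# Build the 6-row lookup table: one row per (original string, single element).
conn.execute("CREATE TABLE parsed (Indicator TEXT, Indicator_Parsed TEXT)")
for (indicator,) in conn.execute("SELECT Indicator FROM old_table").fetchall():
    parts = [p.strip() for p in indicator.split(",")]
    conn.executemany(
        "INSERT INTO parsed VALUES (?, ?)", [(indicator, p) for p in parts]
    )

# An Indicator gets Type 1 if any of its elements is in the list.
rows = conn.execute("""
    SELECT Indicator,
           MAX(CASE WHEN Indicator_Parsed IN ('a', 'b') THEN 1 ELSE 0 END) AS Type
    FROM parsed
    GROUP BY Indicator
    ORDER BY Indicator
""").fetchall()

for row in rows:
    print(row)
```

With 200+ indicators, the IN list could itself live in a small table and be joined, so that nothing has to be spelled out in the query.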
On MySQL or SQL Server you can do it as follows:
insert into table2
select Indicator,
CASE WHEN Indicator like '%a%' or Indicator like '%b%' THEN 1 ELSE 0 END As type
from table1;
You can use the REGEXP operator to check for presence of either a, b or both.
SELECT Indicator,
Indicator REGEXP '.*[ab].*'
FROM tab
If you need that into a table, you either create it from scratch
CREATE TABLE your_table AS
SELECT Indicator,
Indicator REGEXP '.*[ab].*'
FROM tab
or you insert values in it:
INSERT INTO your_table
SELECT Indicator,
Indicator REGEXP '.*[ab].*'
FROM tab
I am so green in SQL that I don't even know how to properly phrase my question or look for an existing answer on Stack Overflow or anywhere else. Sorry!
Assume I have 3 columns: one is an ID, and there are two data columns, A and B. A single ID can have multiple entries. I would like to remove all entries where A and B are the same across all rows for a given ID. Here is an example:
ID | A | B
01 | x | y
01 | x | y
01 | x | y
02 | x | y
02 | x | z
02 | x | y
In this table I would like to remove all 3 entries that belong to ID 01 as A as well as B are all x and y, respectively. For ID 02, however, column B differs for the first and second entry. Therefore I like to keep ID 02. I hope this illustrates the idea sufficiently :-).
I am looking for a 'scalable' solution, as I am not only looking at two data columns A and B, but actually 4 different columns.
Does anyone know how to set a proper filter in SQL to remove those entries according to my needs?
Many thanks.
Benjamin
It basically doesn't matter how many columns you actually have; just list them all in the SELECT DISTINCT.
The query below returns the IDs that have more than one distinct combination (the ones to keep), and it can be used as the joining basis for a DELETE:
WITH CTE AS
(SELECT DISTINCT "ID", "A", "B" FROM tab1),
CTE2 AS (SELECT "ID", COUNT(*) count_ FROM CTE GROUP BY "ID" HAVING COUNT(*) >1)
SELECT "ID" FROM CTE2
| ID |
| -: |
| 2 |
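A sketch of the DELETE this can drive, run against an in-memory SQLite database from Python (both are assumptions for illustration, and the data is the example from the question). Note the flipped condition: the query above lists the IDs to keep (more than one distinct combination), so the delete targets IDs with exactly one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab1 (ID TEXT, A TEXT, B TEXT)")
conn.executemany(
    "INSERT INTO tab1 VALUES (?, ?, ?)",
    [
        ("01", "x", "y"), ("01", "x", "y"), ("01", "x", "y"),
        ("02", "x", "y"), ("02", "x", "z"), ("02", "x", "y"),
    ],
)

# Remove every ID whose rows collapse to a single distinct (A, B) combination.
# To compare more columns, add them to the SELECT DISTINCT list.
conn.execute("""
    DELETE FROM tab1
    WHERE ID IN (
        SELECT ID
        FROM (SELECT DISTINCT ID, A, B FROM tab1) AS combos
        GROUP BY ID
        HAVING COUNT(*) = 1
    )
""")

remaining = conn.execute("SELECT ID, A, B FROM tab1 ORDER BY rowid").fetchall()
for row in remaining:
    print(row)
```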
I am learning SQL and doing some practice. I cannot give you the exact scenario because the website I'm practicing on doesn't want any solution to be found directly on the web, so I'll explain the situation in other words. The question I am stuck on uses a table XYZ with columns X, Y, and Z. Column X can have duplicates, and so can column Z. What I need to find is the X's that always have the same value in Z. So
X Y Z
1 ? a
1 ? a
2 ? b
2 ? c
3 ? c
3 ? a
would return me 1 because when X is 1 Z is always a.
My real problem is that I feel I am missing some SQL knowledge in order to achieve this. I would appreciate it if anyone could give me a hint, not a solution, but maybe a link to the SQL knowledge I'm missing or a brief explanation of the SQL statement that would let me do this.
Otherwise have a nice day.
David.
edit: SELECT X FROM XYZ GROUP BY X HAVING COUNT(DISTINCT Z) = 1 worked and I understand it well. Now what I cannot understand is how to add the Z column to the resultset.
select x, min(z)
from tab
group by x
having min(z) = max(z)
-- or, instead of the HAVING line above:
-- having count(distinct z) = 1
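Here is a minimal sketch of that query on the sample data from the question, using an in-memory SQLite database from Python (both are assumptions for illustration; the Y column is filled with '?' as in the question). Because every surviving group holds a single Z value, min(z) is simply that value, which is how Z gets into the result set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE XYZ (X INTEGER, Y TEXT, Z TEXT)")
conn.executemany(
    "INSERT INTO XYZ VALUES (?, ?, ?)",
    [(1, "?", "a"), (1, "?", "a"), (2, "?", "b"),
     (2, "?", "c"), (3, "?", "c"), (3, "?", "a")],
)

# Keep only the X groups with one distinct Z, and report that Z via MIN.
rows = conn.execute("""
    SELECT X, MIN(Z) AS Z
    FROM XYZ
    GROUP BY X
    HAVING COUNT(DISTINCT Z) = 1
""").fetchall()

print(rows)
```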
There are several simple SQL constructs that could be used to accomplish this:
DISTINCT MSDN http://technet.microsoft.com/en-us/library/ms187831(v=SQL.105).aspx
GROUP BY http://technet.microsoft.com/en-us/library/ms177673.aspx
I have the following table with 3 columns, this is just a sample (not a real one)
Number Description Value
1 Green 100
1 Yellow 101
1 Blue 102
2 Chair 200
2 Table 101
2 Green 150
3 Car 200
3 plane 205
3 green 105
My first query finds any row that contains the value "101". No problem: in this scenario I find 2 records, [1, Yellow, 101] and [2, Table, 101], and all the records with number 3 and the rest are ignored, which is perfect. Now I need to select other records based on the RESULT of the FIRST query, where the numbers 1 and 2 are the true results. So for the values found in column [Number], in this case [1 and 2], I want to look up and add the value of any row whose description is [Green], still ignoring [3], which has NO [101] value.
The ideal result I want to display is
[1, yellow, 101] and Green is 100
[2, table, 101] and Green is 150
I have a headache trying to get this to work, so far with no good result. If anyone has any idea how to write the script for this case, please let me know. I hope the question is clear.
P.S. The content of the table is fake, just to give an impression of what this is about, and FYI it's SQL + PHP.
Try this:
select
concat('[', concat_ws(', ', s1.number, s1.description, s1.value), '] and ', s2.description, ' is ', s2.value)
from
sample s1
inner join
sample s2 on s2.number = s1.number
where
s1.value = 101 and
s2.description = 'Green'
Try this. I think this is what you are trying to do.
edit. Are you trying to only get the "Green"?
SELECT * from table WHERE table.Number IN (
SELECT Number FROM table WHERE table.Value = 101
) AND table.Description = "Green";
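The self-join idea from the first answer can be tried on the sample data with the sketch below. It uses an in-memory SQLite database from Python (an assumption for illustration; the question mentions SQL + PHP) and returns plain columns instead of the concatenated string. The table name sample follows the first answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (number INTEGER, description TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?)",
    [(1, "Green", 100), (1, "Yellow", 101), (1, "Blue", 102),
     (2, "Chair", 200), (2, "Table", 101), (2, "Green", 150),
     (3, "Car", 200), (3, "plane", 205), (3, "green", 105)],
)

# s1 picks the rows with value 101; s2 picks the Green row of the same number.
rows = conn.execute("""
    SELECT s1.number, s1.description, s1.value, s2.description, s2.value
    FROM sample AS s1
    INNER JOIN sample AS s2 ON s2.number = s1.number
    WHERE s1.value = 101
      AND s2.description = 'Green'
    ORDER BY s1.number
""").fetchall()

for row in rows:
    print(row)
```

Number 3 never appears because it has no row with value 101, so there is nothing for s1 to match.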
I have seen that, thanks a lot, but it shows blank fields :(. However, I have done many tests this morning, and after so much coffee I think what I am trying to do doesn't make sense. If I find the record [1, yellow, 101] based on the condition value "101", then it is impossible to add the content field [Green] as I wanted, because number [1] matches both [101] and [Green].
I tried "OR", but it also finds the record [3, green, 105], which is not good because 3 has no [101] data. I know this is a crazy puzzle. So now I want to add another table which will have a UNIQUE number 1, 2 and 3, and that number should correspond to the first column of the table sample; only sample has duplicates of 1s, 2s, 3s, etc. If I do that, would it be possible to have a workable condition where I can display the data in one row or list like the following: 1, yellow, 101, and Green, 100? I hate to give up :(
What I am trying to achieve is straightforward; however, it is a little difficult to explain and I don't know if it is actually even possible in Postgres. I am at a fairly basic level: SELECT, FROM, WHERE, LEFT JOIN ON, HAVING, etc., the basic stuff.
I am trying to count the number of rows that contain a particular letter/number and display that count against the letter/number.
i.e How many rows have entries that contain an "a/A" (Case insensitive)
The table I'm querying is a list of film names. All I want to do is group and count 'a-z' and '0-9' and output the totals. I could run 36 queries sequentially:
SELECT filmname FROM films WHERE filmname ilike '%a%'
SELECT filmname FROM films WHERE filmname ilike '%b%'
SELECT filmname FROM films WHERE filmname ilike '%c%'
And then run pg_num_rows on the result to find the number I require, and so on.
I know how intensive like is and ilike even more so I would prefer to avoid that. Although the data (below) has upper and lower case in the data, I want the result sets to be case insensitive. i.e "The Men Who Stare At Goats" the a/A,t/T and s/S wouldn't count twice for the resultset. I can duplicate the table to a secondary working table with the data all being strtolower and working on that set of data for the query if it makes the query simpler or easier to construct.
An alternative could be something like
SELECT sum(length(regexp_replace(filmname, '[^X|^x]', '', 'g'))) FROM films;
for each letter combination but again 36 queries, 36 datasets, I would prefer if I could get the data in a single query.
Here is a short data set of 14 films from my set (which actually contains 275 rows)
District 9
Surrogates
The Invention Of Lying
Pandorum
UP
The Soloist
Cloudy With A Chance Of Meatballs
The Imaginarium of Doctor Parnassus
Cirque du Freak: The Vampires Assistant
Zombieland
9
The Men Who Stare At Goats
A Christmas Carol
Paranormal Activity
If I manually lay out each letter and number in a column and then register if that letter appears in the film title by giving it an x in that column and then count them up to produce a total I would have something like this below. Each vertical column of x's is a list of the letters in that filmname regardless of how many times that letter appears or its case.
The result for the short set above is:
A x x xxxx xxx 9
B x x 2
C x xxx xx 6
D x x xxxx 6
E xx xxxxx x 8
F x xxx 4
G xx x x 4
H x xxxx xx 7
I x x xxxxx xx 9
J 0
K x 1
L x xx x xx 6
M x xxxx xxx 8
N xx xxxx x x 8
O xxx xxx x xxx 10
P xx xx x 5
Q x 1
R xx x xx xxx 7
S xx xxxx xx 8
T xxx xxxx xxx 10
U x xx xxx 6
V x x x 3
W x x 2
X 0
Y x x x 3
Z x 1
0 0
1 0
2 0
3 0
4 0
5 0
6 0
7 0
8 0
9 x x 2
In the example above, each column is a "filmname" As you can see, column 5 marks only a "u" and a "p" and column 11 marks only a "9". The final column is the tally for each letter.
I want to build a query that gives me the result rows A 9, B 2, C 6, D 6, E 8, etc., taking into account every row entry extracted from my films column. If a letter doesn't appear in any row, I would like a zero.
I don't know if this is even possible or whether to do it systematically in php with 36 queries is the only possibility.
In the current dataset there are 275 entries and it grows by around 8.33 a month (100 a year). I predict it will reach around 1000 rows by 2019 by which time I will be no doubt using a completely different system so I don't need to worry about working with a huge dataset to trawl through.
The current longest title is "Percy Jackson & the Olympians: The Lightning Thief" at 50 chars (yes, poor film I know ;-) and the shortest is 1, "9".
I am running version 9.0.0 of Postgres.
Apologies if I've said the same thing multiple times in multiple ways, I am trying to get as much information out so you know what I am trying to achieve.
If you need any clarification or larger datasets to test with please just ask and I'll edit as needs be.
Suggestions are VERY welcome.
Edit 1
Erwin Thanks for the edits/tags/suggestions. Agree with them all.
Fixed the missing "9" typo as suggested by Erwin. Manual transcribe error on my part.
kgrittn, Thanks for the suggestion but I am not able to update the version from 9.0.0. I have asked my provider if they will try to update.
Response
Thanks for the excellent reply Erwin
Apologies for the delay in responding but I have been trying to get your query to work and learning the new keywords to understand the query you created.
I adjusted the query to adapt into my table structure but the result set was not as expected (all zeros) so I copied your lines directly and had the same result.
The result set in both cases lists all 36 rows with the appropriate letters/numbers; however, all the rows show zero as the count (ct).
I have tried to deconstruct the query to see where it may be falling over.
The result of
SELECT DISTINCT id, unnest(string_to_array(lower(film), NULL)) AS letter
FROM films
is "No rows found". Perhaps it ought to when extracted from the wider query, I'm not sure.
When I removed the unnest function the result was 14 rows all with "NULL"
If I adjust the function
COALESCE(y.ct, 0) to COALESCE(y.ct, 4)
then my dataset responds all with 4's for every letter instead of zeros as explained previously.
Having briefly read up on COALESCE the "4" being the substitute value I am guessing that y.ct is NULL and being substituted with this second value (this is to cover rows where the letter in the sequence is not matched, i.e if no films contain a 'q' then the 'q' column will have a zero value rather than NULL?)
The database I tried this on was SQL_ASCII and I wondered if that was somehow a problem but I had the same result on one running version 8.4.0 with UTF-8.
Apologies if I've made an obvious mistake but I am unable to return the dataset I require.
Any thoughts?
Again, thanks for the detailed response and your explanations.
This query should do the job:
Test case:
CREATE TEMP TABLE films (id serial, film text);
INSERT INTO films (film) VALUES
('District 9')
,('Surrogates')
,('The Invention Of Lying')
,('Pandorum')
,('UP')
,('The Soloist')
,('Cloudy With A Chance Of Meatballs')
,('The Imaginarium of Doctor Parnassus')
,('Cirque du Freak: The Vampires Assistant')
,('Zombieland')
,('9')
,('The Men Who Stare At Goats')
,('A Christmas Carol')
,('Paranormal Activity');
Query:
SELECT l.letter, COALESCE(y.ct, 0) AS ct
FROM (
SELECT chr(generate_series(97, 122)) AS letter -- a-z in UTF8!
UNION ALL
SELECT generate_series(0, 9)::text -- 0-9
) l
LEFT JOIN (
SELECT letter, count(id) AS ct
FROM (
SELECT DISTINCT -- count film once per letter
id, unnest(string_to_array(lower(film), NULL)) AS letter
FROM films
) x
GROUP BY 1
) y USING (letter)
ORDER BY 1;
This requires PostgreSQL 9.1! Consider the release notes:
Change string_to_array() so a NULL separator splits the string into
characters (Pavel Stehule)
Previously this returned a null value.
You can use regexp_split_to_table(lower(film), ''), instead of unnest(string_to_array(lower(film), NULL)) (works in versions pre-9.1!), but it is typically a bit slower and performance degrades with long strings.
I use generate_series() to produce the [a-z0-9] as individual rows. And LEFT JOIN to the query, so every letter is represented in the result.
Use DISTINCT to count every film once.
Never worry about 1000 rows. That is peanuts for modern day PostgreSQL on modern day hardware.
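For experimenting without PostgreSQL 9.1, the same three steps (split each title into characters, keep one row per film and character, LEFT JOIN against the full character list) can be sketched in SQLite from Python. Everything SQLite-specific here is an assumption for illustration: a recursive CTE stands in for unnest(string_to_array(...)), and a small chars table filled from Python stands in for generate_series().

```python
import sqlite3
import string

FILMS = [
    "District 9", "Surrogates", "The Invention Of Lying", "Pandorum", "UP",
    "The Soloist", "Cloudy With A Chance Of Meatballs",
    "The Imaginarium of Doctor Parnassus",
    "Cirque du Freak: The Vampires Assistant", "Zombieland", "9",
    "The Men Who Stare At Goats", "A Christmas Carol", "Paranormal Activity",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE films (id INTEGER PRIMARY KEY, film TEXT)")
conn.executemany("INSERT INTO films (film) VALUES (?)", [(f,) for f in FILMS])

# The 36 characters of interest: a-z followed by 0-9.
conn.execute("CREATE TABLE chars (letter TEXT)")
conn.executemany(
    "INSERT INTO chars VALUES (?)",
    [(c,) for c in string.ascii_lowercase + string.digits],
)

query = """
WITH RECURSIVE split(id, letter, rest) AS (
    -- peel off one lowercased character per step
    SELECT id, substr(lower(film), 1, 1), substr(lower(film), 2) FROM films
    UNION ALL
    SELECT id, substr(rest, 1, 1), substr(rest, 2) FROM split WHERE rest <> ''
),
per_film AS (
    SELECT DISTINCT id, letter FROM split   -- count a film once per letter
)
SELECT chars.letter, COUNT(per_film.id) AS ct
FROM chars
LEFT JOIN per_film ON per_film.letter = chars.letter
GROUP BY chars.letter
ORDER BY chars.letter
"""
counts = dict(conn.execute(query).fetchall())

for letter in ("a", "j", "k", "9"):
    print(letter, counts[letter])
```

COUNT(per_film.id) skips the NULLs produced by the LEFT JOIN, so letters that appear in no title come out as 0 without needing COALESCE.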
A fairly simple solution which only requires a single table scan would be the following.
SELECT
'a', SUM( (title ILIKE '%a%')::integer),
'b', SUM( (title ILIKE '%b%')::integer),
'c', SUM( (title ILIKE '%c%')::integer)
FROM film
I left the other 33 characters as a typing exercise for you :)
BTW 1000 rows is tiny for a PostgreSQL database. It only begins to get large when the DB is larger than the memory in your server.
edit: had a better idea
SELECT chars.c, COUNT(title)
FROM (VALUES ('a'), ('b'), ('c')) as chars(c)
LEFT JOIN film ON title ILIKE ('%' || chars.c || '%')
GROUP BY chars.c
ORDER BY chars.c
You could also replace the (VALUES ('a'), ('b'), ('c')) as chars(c) part with a reference to a table containing the list of characters you are interested in.
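A small sketch of this LEFT JOIN variant, run in SQLite from Python (both are assumptions for illustration). SQLite has no ILIKE and does not accept a column alias list on a VALUES subquery, so the character list is written as a CTE and the title is lowercased before the LIKE. Only four titles and four characters are used to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE film (title TEXT)")
conn.executemany(
    "INSERT INTO film VALUES (?)",
    [("District 9",), ("UP",), ("Zombieland",), ("9",)],
)

rows = conn.execute("""
    WITH chars(c) AS (VALUES ('d'), ('p'), ('9'), ('x'))
    SELECT chars.c, COUNT(film.title) AS ct
    FROM chars
    LEFT JOIN film ON lower(film.title) LIKE '%' || chars.c || '%'
    GROUP BY chars.c
    ORDER BY chars.c
""").fetchall()

for row in rows:
    print(row)
```

Counting film.title rather than COUNT(*) is what gives 'x' a zero: the LEFT JOIN still produces one row for it, but with a NULL title.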
This will give you the result in a single row, with one column for each matching letter and digit.
SELECT
SUM(CASE WHEN POSITION('a' IN lower(filmname)) > 0 THEN 1 ELSE 0 END) AS "A",
SUM(CASE WHEN POSITION('b' IN lower(filmname)) > 0 THEN 1 ELSE 0 END) AS "B",
SUM(CASE WHEN POSITION('c' IN lower(filmname)) > 0 THEN 1 ELSE 0 END) AS "C",
...
SUM(CASE WHEN POSITION('z' IN lower(filmname)) > 0 THEN 1 ELSE 0 END) AS "Z",
SUM(CASE WHEN POSITION('0' IN lower(filmname)) > 0 THEN 1 ELSE 0 END) AS "0",
SUM(CASE WHEN POSITION('1' IN lower(filmname)) > 0 THEN 1 ELSE 0 END) AS "1",
...
SUM(CASE WHEN POSITION('9' IN lower(filmname)) > 0 THEN 1 ELSE 0 END) AS "9"
FROM films;
An approach similar to Erwin's, but maybe more comfortable in the long run:
Create a table with each character you're interested in:
CREATE TABLE char (name char (1), id serial);
INSERT INTO char (name) VALUES ('a');
INSERT INTO char (name) VALUES ('b');
INSERT INTO char (name) VALUES ('c');
Then grouping over its values is easy:
SELECT char.name, COUNT(*)
FROM char, film
WHERE film.name ILIKE '%' || char.name || '%'
GROUP BY char.name
ORDER BY char.name;
Don't worry about ILIKE.
I'm not 100% happy about using the keyword 'char' as the table name, but I haven't had bad experiences so far. On the other hand, it is the natural name. If you translate it to another language, like 'zeichen' in German, you avoid ambiguities.