Adding column to table based on whether another column = a specific string - sql

I want to add a column called "Sweep" that contains bools based on whether the "Result" was a sweep or not. So I want the value in the "Sweep" column to be True if the "Result" is '4-0' or '0-4' and False if it isn't.
This is a part of the table:
I tried this:
ALTER TABLE "NBA_finals_1950-2018"
ADD "Sweep" BOOL;
UPDATE "NBA_finals_1950-2018"
SET "Sweep" = ("Result" = '4-0' OR "Result" = '0-4');
But for some reason, when I run this code...:
SELECT *
FROM "NBA_finals_1950-2018"
ORDER BY "Year";
...only one of the rows (last row) has the value True even though there are other rows where the result is a sweep ('4-0' or '0-4') as shown in the picture below.
I don't know why this is happening but I guess there is something wrong with the UPDATE...SET code. Please help.
Thanks in advance.
NOTE: I am using PostgreSQL 13

This would occur if the strings are not really what they look like -- this is often due to spaces at the beginning or end. Or perhaps to hyphens being different, or other look-alike characters.
You just need to find the right pattern. Do so with a select. If the stored strings are not what they appear to be, this returns few or no rows:
select *
from "NBA_finals_1950-2018"
where "Result" in ('4-0', '0-4');
You can try:
where "Result" like '%0-4%' or
"Result" like '%4-0%'
But, this should do what you want:
where "Result" like '%4%' and
"Result" like '%0%'
because the numbers are all single digits.
You can incorporate this into the update statement.
Note: identifiers that need double quotes are a bad idea. I would recommend creating tables and columns with names that do not have to be escaped.
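A minimal sketch of the diagnosis above, assuming the cause is a trailing space or a look-alike dash. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (where TRIM and REPLACE are also available); the table name finals and the rows are made up for illustration, and the boolean is stored as 0/1 because SQLite has no BOOL type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE finals ("Year" INTEGER, "Result" TEXT, "Sweep" INTEGER)')
conn.executemany(
    'INSERT INTO finals ("Year", "Result") VALUES (?, ?)',
    [
        (1950, "4-2"),
        (1959, "4-0 "),      # trailing space
        (1971, "4\u20130"),  # en dash instead of a hyphen
        (2018, "4-0"),       # clean value
    ],
)


def sweep_years():
    """Return the years currently flagged as sweeps."""
    query = 'SELECT "Year" FROM finals WHERE "Sweep" = 1 ORDER BY "Year"'
    return [year for (year,) in conn.execute(query)]


# Plain equality matches only the clean value.
conn.execute("""UPDATE finals SET "Sweep" = ("Result" IN ('4-0', '0-4'))""")
naive = sweep_years()

# Normalise before comparing: swap the en dash for a hyphen, then trim spaces.
conn.execute(
    """UPDATE finals
       SET "Sweep" = (TRIM(REPLACE("Result", ?, '-')) IN ('4-0', '0-4'))""",
    ("\u2013",),
)
normalised = sweep_years()

print(naive)
print(normalised)
```

Cleaning the stored values once, rather than normalising in every query, is usually the better long-term fix.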

Related

Which one of these two SELECT statements is correct?

I am a little bit confused and have no idea which one of these two SELECT statements is correct:
SELECT Value FROM visibility WHERE site_info LIKE '%site_is_down%';
OR
SELECT Value FROM visibility WHERE site_info = 'site_is_down';
When I run both of these I get the same result, but I would like to know which one is correct, given that the Value column is a VARCHAR, or whether both of these SELECTs are incorrect.
Result set running first SELECT
Value
1. 0
Result set running second SELECT
Value
1. 0
The two statements do not do the same thing.
The first statement filters on rows whose site_info contains the string 'site_is_down'. The surrounding '%' are wildcards. So it would match on something like 'It looks like site_is_down right now'.
The second query, with the equality condition, filters on rows whose site_info is exactly 'site_is_down'.
Everything that the second query returns is also returned by the first query, but the opposite is not true.
Which statement is "correct" depends on your actual requirement.
If both queries are useful for you, I'd use the second query, as it is the simplest and can use an ordinary index on site_info, which a pattern with a leading wildcard generally cannot.
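The difference can be seen on a small made-up data set. This is a sketch using Python's built-in sqlite3 module rather than the asker's database, and the extra rows are assumptions added for illustration. One detail worth knowing: in LIKE the underscore is itself a wildcard for any single character, so the pattern also matches 'site-is-down'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE visibility (site_info TEXT, "Value" TEXT)')
conn.executemany(
    "INSERT INTO visibility VALUES (?, ?)",
    [
        ("site_is_down", "0"),
        ("It looks like site_is_down right now", "1"),
        ("site_is_up", "2"),
        ("site-is-down", "3"),  # no underscores at all
    ],
)


def values(where_clause):
    """Run the SELECT with the given WHERE clause and return the Value column."""
    query = f'SELECT "Value" FROM visibility WHERE {where_clause} ORDER BY "Value"'
    return [value for (value,) in conn.execute(query)]


like_values = values("site_info LIKE '%site_is_down%'")
equal_values = values("site_info = 'site_is_down'")

print(like_values)
print(equal_values)
```

With only the first row in the table, both queries return the same single value, which is what the question observed.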

PostgreSQL - Assign integer value to string in case statement

I need to select one and only 1 row of data based on an ID in the data I have. I thought I had solved this (For details, see my original question and my solution, here: PostgreSQL - Select only 1 row for each ID)
However, I now still get multiple values in some cases. If there is only "N/A" and 1 other value, then no problem.. but if I have multiple values like: "N/A", "value1" and "value2" for example, then my case statement is not sufficient and I get both "value1" and "value2" returned to me. This is the case statement in question:
CASE
WHEN "PQ"."Value" = 'N/A' THEN 1
ELSE 0
END
I need to give a unique integer value to each string value and then the problem will be solved. The question is: how do I do this? My first thought is to somehow convert the character values to ASCII and sum them up.. but I am not sure how to do that and also worried about performance. Is there a way to very simply assign a value to each string so that I can choose 1 value only? I don't care which one actually... just that it's only 1.
EDIT
I am now trying to create a function to add up the ASCII values of each character so I can essentially change my case statement to something like this:
CASE
WHEN "PQ"."Value" = 'N/A' THEN 9999999
ELSE SumASCII("PQ"."Value")
END
Having a small problem with it though.. I have added it as a separate question, here: PostgreSQL - ERROR: query has no destination for result data
EDIT 2
Thanks to @Bohemian, I now have a working solution, which is as follows:
CASE
WHEN "PQ"."Value" = 'N/A' THEN -1
ELSE ('x'||LPAD(MD5("PQ"."Value"),16,'0'))::bit(64)::bigint
END DESC
This will produce a "unique" number for each value:
('x'||substr(md5("PQ"."Value"),1,8))::bit(64)::bigint
Strictly speaking, there is a chance of a collision, but it's very remote.
If the result is "too big", you could try modulus:
<above-calculation> % 10000
Although collisions would then be a 0.01% chance, you should try this formula against all known values to ensure there are no collisions.
If you don't care which value gets picked, change RANK() to ROW_NUMBER(). If you do care, do it anyway, but also add another term after the CASE statement in ORDER BY, separated by a comma, with the logic you want - for example if you want the first value alphabetically, do this:
...
ORDER BY CASE...END, "PQ"."Value")
...
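The idea of ranking values by a hash, with 'N/A' forced to the bottom and the value itself as a tie-breaker, can be sketched outside the database. This Python version is only an illustration of the logic, not the PostgreSQL cast shown above; sort_key and pick_one are hypothetical helper names.

```python
import hashlib


def sort_key(value):
    """Return (number, value): 'N/A' sorts lowest, other strings by hash."""
    if value == "N/A":
        return (-1, value)
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    # First 8 hex digits = first 32 bits of the MD5 digest, always >= 0.
    # The value itself is the second element, so a hash collision
    # still yields a deterministic order.
    return (int(digest[:8], 16), value)


def pick_one(values):
    """Pick exactly one value, preferring anything over 'N/A'."""
    return max(values, key=sort_key)


candidates = ["N/A", "value1", "value2"]
chosen = pick_one(candidates)
print(chosen)
```

Because the tie-breaker already makes the ordering total, the hash is not strictly required: sorting on the 'N/A' flag and then on the value gives one row per ID as well.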

In a hive query can you specify a condition like "where column1 is INT"?

I would like to query a hive table only for those rows that have an integer value in column1. Due to some data corruption, without this check I am getting a lot of junk data. I would like to get rid of that data by applying a "where column1 is INT" kind of condition, but I couldn't find anything like that in hive. Could anyone suggest how I could do it?
Without any example data, I would suggest something very basic like this:
define column X as STRING
check that X = cast(cast(X as INT) as STRING)
You may have to add some tolerance to blank space, zero-padding, etc. depending on the way your "integers" are actually formatted.
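The round-trip check, and one possible form of the tolerance mentioned above, can be sketched in Python. This only illustrates the logic: Python's int() does not parse strings exactly the way Hive's cast does, and both function names are hypothetical.

```python
def is_plain_integer(text):
    """True only if text survives a string -> int -> string round trip."""
    try:
        return str(int(text)) == text
    except (TypeError, ValueError):
        return False


def is_integer_tolerant(text):
    """Like is_plain_integer, but accepts surrounding blanks and zero-padding."""
    if not isinstance(text, str):
        return False
    body = text.strip()
    if body.startswith("-"):
        body = body[1:]
    # Only ASCII digits may remain after removing blanks and one minus sign.
    return body.isascii() and body.isdigit()


for sample in ["42", "-5", "007", " 42 ", "1.5", "abc", ""]:
    print(repr(sample), is_plain_integer(sample), is_integer_tolerant(sample))
```

The strict version rejects "007" and " 42 " because they do not reproduce themselves after the round trip; the tolerant version accepts them.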
Found a solution that works:
I could add a double check like the one below; anything other than a number is cast to null. Also, the valid numbers for the column will never exceed the double range.
So we could do something like this, I guess:
"select * from table_example where cast(column1 as double) is not null"

SQL - Conditionally joining two columns in same table into one

I am working with a table that contains two versions of stored information. To simplify it, one column contains the old description of a file run while another column contains the updated standard for displaying ran files. It gets more complicated in that the older column can have multiple standards within itself. The table:
Old Column            New Column
Desc: LGX/101/rpt     null
null                  Home
Print: LGX/234/rpt    null
null                  Print
null                  Page
I need to combine the two columns into one, but I also need to delete the "Print: " and "Desc: " string from the beginning of the old column values. Any suggestions? Let me know if/when I'm forgetting something you need to know!
(I am writing in Cache SQL, but I'd just like a general approach to my problem, I can figure out the specifics past that.)
EDIT: the condition is that if substr(oldcol,1,6) = 'Desc: ' then substr(oldcol,7),
else if substr(oldcol,1,7) = 'Print: ' then substr(oldcol,8), etc., so as to take out the "Desc: " and the "Print: " and sanitize the data somewhat.
EDIT2: I want to make the table look like this:
Col
LGX/101/rpt
Home
LGX/234/rpt
Print
Page
It's difficult to understand exactly what you are looking for. Does the above represent before/after, or both columns that need combining/merging?
My guess is that COALESCE might be able to help you. It takes a bunch of parameters and returns the first non NULL.
It looks like you're wanting to grab values from new if old is NULL and old if new is null. To do that you can use a case statement in your SQL. I know CASE statements are supported by MySQL, I'm not sure if they'll help you here.
SELECT (CASE WHEN old_col IS NULL THEN new_col ELSE old_col END) as val FROM table_name
This will grab new_col if old_col is NULL, otherwise it will grab old_col.
You can remove the "Print:" and "Desc:" prefixes by using a combination of the CHARINDEX and SUBSTRING functions (T-SQL syntax), with LTRIM to drop the space that follows the colon. Here it goes:
SELECT CASE WHEN CHARINDEX(':', COALESCE(OldCol, NewCol)) > 0 THEN
           LTRIM(SUBSTRING(COALESCE(OldCol, NewCol), CHARINDEX(':', COALESCE(OldCol, NewCol)) + 1, 8000))
       ELSE
           COALESCE(OldCol, NewCol)
       END AS Newcolvalue
FROM [SchemaName].[TableName]
CHARINDEX returns the position of the character or string you are searching for.
So you get the position of ":" in the combined value (the COALESCE part) and pass it to SUBSTRING, adding 1 so that the substring starts just after the ":". LTRIM then removes the leading space, and you have a string without "Desc: " or "Print: ".
Hope this helps.
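For a general approach that is easy to try out, here is a sketch of the same COALESCE-then-strip idea using Python's built-in sqlite3 module. The table name runs and the column names old_col and new_col are assumptions, and SQLite's INSTR and SUBSTR stand in for whatever string functions your dialect offers. It splits on ': ' (colon plus space) so no leading blank is left behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (old_col TEXT, new_col TEXT)")
conn.executemany(
    "INSERT INTO runs VALUES (?, ?)",
    [
        ("Desc: LGX/101/rpt", None),
        (None, "Home"),
        ("Print: LGX/234/rpt", None),
        (None, "Print"),
        (None, "Page"),
    ],
)

# Inner query: take whichever column is not NULL.
# Outer query: if the value contains ': ', keep only what follows it.
query = """
SELECT CASE
         WHEN INSTR(merged, ': ') > 0
           THEN SUBSTR(merged, INSTR(merged, ': ') + 2)
         ELSE merged
       END
FROM (SELECT rowid AS seq, COALESCE(old_col, new_col) AS merged FROM runs)
ORDER BY seq
"""
combined = [value for (value,) in conn.execute(query)]
print(combined)
```

Note that this strips any prefix ending in ': ', not only "Desc: " and "Print: "; if new-column values can legitimately contain a colon, match the two prefixes explicitly instead.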

Problem with MySQL Select query with "IN" condition

I found a weird problem with MySQL select statement having "IN" in where clause:
I am trying this query:
SELECT ads.*
FROM advertisement_urls ads
WHERE ad_pool_id = 5
AND status = 1
AND ads.id = 23
AND 3 NOT IN (hide_from_publishers)
ORDER BY rank desc
In above SQL hide_from_publishers is a column of advertisement_urls table, with values as comma separated integers, e.g. 4,2 or 2,7,3 etc.
So if two records have the hide_from_publishers values above, the query should return only the record with "4,2", but it returns both records.
Now, if I change the value of hide_from_publishers for the second record to 3,2,7 and run the query again, it returns a single record, which is the correct output.
If I use literal values there instead of hide_from_publishers, i.e. (2,7,3), it does recognize them and returns a single record.
Any thoughts about this strange problem or am I doing something wrong?
There is a difference between the tuple (1, 2, 3) and the string "1, 2, 3". The former is three values, the latter is a single string value that just happens to look like three values to human eyes. As far as the DBMS is concerned, it's still a single value.
If you want more than one value associated with a record, you shouldn't be storing it as a comma-separated value within a single field, you should store it in another table and join it. That way the data remains structured and you can use it as part of a query.
You need to treat the comma-delimited hide_from_publishers column as a string. You can use the LOCATE function to determine if your value exists in the string.
Note that I've added leading and trailing commas to both strings so that a search for "3" doesn't accidentally match "13".
select ads.*
from advertisement_urls ads
where ad_pool_id = 5
and status = 1
and ads.id = 23
and locate(',3,', concat(',', hide_from_publishers, ',')) = 0
order by rank desc
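The behaviour in the question is consistent with MySQL converting the string to a number for the comparison and keeping only its leading numeric part, so '3,2,7' is treated as 3 while '2,7,3' is treated as 2. The comma-wrapping trick avoids that. Below is a sketch of it using Python's built-in sqlite3 module as a stand-in: SQLite concatenates with || and searches with INSTR, where MySQL would use CONCAT and LOCATE. The rows and the helper visible_to are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE advertisement_urls (id INTEGER, hide_from_publishers TEXT)"
)
conn.executemany(
    "INSERT INTO advertisement_urls VALUES (?, ?)",
    [
        (1, "4,2"),
        (2, "2,7,3"),
        (3, "13,30"),  # must not count as containing 3
        (4, "3"),
    ],
)


def visible_to(publisher_id):
    """Return ids of ads whose hide list does not contain publisher_id."""
    needle = f",{publisher_id},"
    # Wrapping both sides in commas makes every list entry look like ',n,'.
    query = """
        SELECT id FROM advertisement_urls
        WHERE INSTR(',' || hide_from_publishers || ',', ?) = 0
        ORDER BY id
    """
    return [ad_id for (ad_id,) in conn.execute(query, (needle,))]


visible = visible_to(3)
print(visible)
```

This works, but it scans every row; a separate table with one row per (ad, publisher) pair keeps the data structured and lets the database use an index.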
You need to split the string of values into separate values. See this SO question...
Can Mysql Split a column?
As well as the supplied example...
http://blog.fedecarg.com/2009/02/22/mysql-split-string-function/
Here is another SO question:
MySQL query finding values in a comma separated string
And the suggested solution:
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_find-in-set