I have two tables that store two sets of IDs which are the same; when any of the four IDs is NULL there's an issue in the front-end application. Which of the four values is NULL always varies, but there will always be one with the correct entry.
My question is: can I enter these four values into a temp table and then update all the NULL values using the column which actually has a value? The fact that the column with the correct value changes all the time makes this harder.
Basically I'm making a stored proc but can't figure this logic out.
It sounds like you just need to use coalesce to find the non-NULL value.
coalesce(table1.col1, table1.col2, table2.col1, table2.col2)
The only caveat is that if two columns have different non-NULL values, then this expression returns the first one (in the order you list the columns) it finds. But if you don't have that situation occur, or if you can specify which column you'd use when it does occur, this should work regardless of what combination of columns has NULL.
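For example, a rough sketch of how you might fold that into an update - the join key and all the table/column names here are placeholders, since I don't know your schema:

-- fill the NULL id columns in table1 from whichever of the four columns has a value
update t1
set t1.col1 = coalesce(t1.col1, t1.col2, t2.col1, t2.col2),
    t1.col2 = coalesce(t1.col2, t1.col1, t2.col1, t2.col2)
from table1 t1
join table2 t2
  on t2.row_id = t1.row_id   -- assumed join key

Both SET expressions see the pre-update values, so the order of the two assignments doesn't matter; table2 would need a matching UPDATE of its own.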
Use a table expression to join the two tables, then, from the result, update the columns that are missing data with the values from the ones that have them.
Hey guys, maybe this is a basic SQL question. Say I have this very simple table; I need to run a simple SQL statement to return a result like this:
Basically, it's to dedup Name based on its row's Value column; whichever is larger should stay.
Thanks!
Framing the problem correctly would help you figure it out.
"Deduplication" suggests altering the table - starting with a state with duplicates, ending with a state without them. Usually done in three steps (getting the rows without duplicates into temp table, removing original table, renaming temp table).
"Removing rows with duplicated column values" also suggests alteration of data and derails train of thought.
What you do want is to get the entire table and, in the cases where the columns you care about have multiple values attached, get the highest one. One could say... group by the columns you care about? And attach to them the highest value, a maximum value?
select id,name,max(value) from table group by id,name
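As a quick illustration - your table isn't shown, so the sample data below is invented, and I'm assuming duplicate Names share the same id so the grouping collapses them:

-- invented sample: duplicate Names, keep the row with the larger Value
create table #t (id int, name varchar(20), value int);
insert into #t values (1, 'Alpha', 10), (1, 'Alpha', 40), (2, 'Beta', 25);

select id, name, max(value) as value
from #t
group by id, name;
-- returns (1, 'Alpha', 40) and (2, 'Beta', 25)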
I'm looking for a way to join two (sometimes more) tables.
I'll start with two and add as I get the pieces working.
Table1 has two columns that identify it
T1ContainerID
T1ObjectID
Table2 has similar columns but starts with T2 but the values will match
T2ContainerID
T2ObjectID
In Table2 there are two columns I am targeting
ObjectName
ObjectValue
There can be any number of ObjectName entries for a given record.
For instance one may have name, address, and a date;
another may have Name, address, port, date, ServerName, Device, Status.
What I need is a way to pivot all of the potential columns in Table2 in line with Table1, and if a value is not in Table2 for a Table1 record, then just make it NULL. I want the header of these columns to be the ObjectName and the value to be the ObjectValue. If I can't get a wildcard to grab all potential values, I can settle for just calling out each column manually; I was only hoping for a wildcard as the set of names may change as new records get added. Worst case I just adjust the code to add anything new.
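To make it concrete, something along these lines is roughly what I'm picturing, with each ObjectName spelled out manually - the column names are the ones listed above, and the ObjectName values are just the examples I gave:

-- one column per ObjectName, NULL where Table2 has no entry for that record
SELECT  t1.T1ContainerID,
        t1.T1ObjectID,
        MAX(CASE WHEN t2.ObjectName = 'Name'       THEN t2.ObjectValue END) AS [Name],
        MAX(CASE WHEN t2.ObjectName = 'Address'    THEN t2.ObjectValue END) AS [Address],
        MAX(CASE WHEN t2.ObjectName = 'ServerName' THEN t2.ObjectValue END) AS [ServerName]
FROM Table1 t1
LEFT JOIN Table2 t2
       ON t2.T2ContainerID = t1.T1ContainerID
      AND t2.T2ObjectID    = t1.T1ObjectID
GROUP BY t1.T1ContainerID, t1.T1ObjectID;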
I do have a bunch of queries that rebuild the database every night and dump it into a different database, but I'd like to have a query that pulls the results from the main database to get current values, rather than something that was run every morning.
Thanks for the contributions so far. After more digging I am re-stating the question (and indeed the title of the question) as follows:
I am selecting just 2 columns from a view that contains several columns. The view returns 50,497 rows if I select all columns, but only 50,496 (i.e. 1 fewer) when I select just 2 columns, these being [Patient_ID] (which is a bigint column) and [Condition_Code] (a varchar(6) column).
Version 1:
SELECT * FROM [vw_Query1]
returns 50,497 rows.
But:
SELECT [Patient_ID], [Condition_Code] FROM [vw_Query1]
returns 50,496 rows.
I can post the code for [vw_Query1] if required, but the key question for me is understanding, at a fundamental level, how this can happen when no GROUP BY clause has been used.
UPDATE:
It turns out that if I exclude one particular column, I get the lower number of rows, 50,496. This column is unique in having a case-sensitive collation. I still don't understand why it is dropping one particular row, but at least I am getting closer to an understanding.
I have a table (a) that contains imported data, and one of the values in that table needs to be joined to another table (b) based on that value. In table b, that value is sometimes part of a comma-separated list, and it is stored as a varchar. This is the first time I have dealt with a database column that contains multiple pieces of data. I didn't design it, and I don't believe it can be changed, although I believe it should be.
For example:
Table a:
column_1
12345
67890
24680
13579
Table b:
column_1
12345,24680
24680,67890
13579
13579,24680
So I am trying to join these tables together, based on this number and two others, but when I run my query, I'm only getting the row that contains just 13579, and none of the rest.
Any ideas how to accomplish this?
Storing lists as a comma-delimited data structure is a sign of bad design, particularly when storing ids, which are presumably integers in their native format.
Sometimes, this is necessary. Here is a method:
select *
from a
join b
  on ',' + b.column_1 + ',' like '%,' + cast(a.column_1 as varchar(255)) + ',%'
This will not perform particularly well, because the query will not take advantage of any indexes.
The idea is to put the delimiter (,) at the beginning and end of b.column_1, so every value in the list has a comma before and after it. Then you can search for a.column_1 with a comma prepended and appended; the commas ensure that 10 does not match 100.
If possible, you should consider an alternative way to represent the data. If you know there are at most two values, you might consider having two columns in a. In general, though, you would have a "join" table, with a separate row for each pair.
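For instance, a sketch of that junction-table layout - the table and column names here are only illustrative:

-- one row per (b-record, id) pair instead of a comma-separated list
create table b_items (
    b_id     int not null,   -- key of the original row in b
    column_1 int not null,   -- a single id, stored as an integer
    primary key (b_id, column_1)
);

-- the join then becomes a plain, index-friendly equality
select *
from a
join b_items bi on bi.column_1 = a.column_1;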
I've set up a view which combines all the data across several tables. Is there a way to write this so that only columns which contain non-null data are displayed, and those columns which contain all NULL values are not included?
ADDED:
Sorry, still studying and working on my first big project so every day seems to be a new experience at the minute. I haven't been very clear, and that's partly because I'm not sure I'm going about things the right way! The client is an academic library, and the database records details of specific collections. The view I mentioned is to display all the data held about an item, so it is bringing together tables on publication, copy, author, publisher, language and so on. A small number of items in the collection are papers, so have additional details over and above the standard bibliographic details. What I didn't want was a user to get all the empty fields relating to papers if what was returned only consisted of books, therefore the paper table fields were all null. So I thought perhaps there would be a way to not show these. Someone has commented that this is the job of the client application rather than the database itself, so I can leave this until I get to that phase of the project.
There is no way to do this in SQL itself - but you can filter out rows instead:
CREATE VIEW dbo.YourView
AS
SELECT (list of fields)
FROM dbo.Table1 t1
INNER JOIN dbo.Table2 t2 ON t1.ID = t2.FK_ID
WHERE t1.SomeColumn IS NOT NULL
AND t2.SomeOtherColumn IS NOT NULL
In your view definition, you can include WHERE conditions which can exclude rows that have certain columns that are NULL.
Update: you cannot really filter out columns - you define the list of columns that are part of your view in your view definition, and this list is fixed and cannot be dynamically changed.
What you might be able to do is use an ISNULL(column, '') construct to replace those NULLs with an empty string. Or else you need to handle excluding those columns in your display front end - not in the SQL view definition.
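A minimal sketch of that ISNULL approach, with made-up view, table and column names (and assuming the paper-only fields are character columns):

CREATE VIEW dbo.ItemDetails_Display
AS
SELECT p.Title,
       ISNULL(pa.ConferenceName, '') AS ConferenceName,   -- paper-only field shown as blank for books
       ISNULL(pa.PaperNumber, '')    AS PaperNumber
FROM dbo.Publication p
LEFT JOIN dbo.Paper pa ON pa.PublicationID = p.PublicationID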
The only thing I see you could do is make sure to select only those columns from the view that you know aren't NULL:
SELECT (list of non-null fields) FROM dbo.YourView
WHERE (column1 IS NOT NULL)
and so forth - but there's no simple or magic way to select all columns that aren't NULL in one SELECT statement...
You cannot do this in a view, but you can do it fairly easily using dynamic SQL in a stored procedure.
Of course, having a schema which shifts is not necessarily good for clients who consume the data, but it can be efficient if you have very sparse data AND the consuming client understands the varying schema.
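A rough sketch of such a procedure - dbo.YourView is a placeholder name, and this is only meant to show the shape of the approach, not production code:

-- builds and runs a SELECT over dbo.YourView that leaves out all-NULL columns
CREATE PROCEDURE dbo.usp_SelectNonNullColumns
AS
BEGIN
    DECLARE @col sysname, @hit int;
    DECLARE @cols nvarchar(max) = N'', @sql nvarchar(max);

    DECLARE col_cur CURSOR FOR
        SELECT name FROM sys.columns WHERE object_id = OBJECT_ID(N'dbo.YourView');

    OPEN col_cur;
    FETCH NEXT FROM col_cur INTO @col;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- COUNT(col) ignores NULLs, so 0 means the column holds nothing but NULLs
        SET @sql = N'SELECT @hit = COUNT(' + QUOTENAME(@col) + N') FROM dbo.YourView';
        EXEC sp_executesql @sql, N'@hit int OUTPUT', @hit = @hit OUTPUT;

        IF @hit > 0
            SET @cols += CASE WHEN @cols = N'' THEN N'' ELSE N', ' END + QUOTENAME(@col);

        FETCH NEXT FROM col_cur INTO @col;
    END
    CLOSE col_cur;
    DEALLOCATE col_cur;

    SET @sql = N'SELECT ' + @cols + N' FROM dbo.YourView';
    EXEC sp_executesql @sql;
END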
If you have to have a view, you can put a "header" row in your view, which you can inspect client-side on the first row of your loop to decide whether to bother with a column in your grid or whatever. You can do something like this:
SELECT * FROM (
-- This is the view code
SELECT 'data' as typ
,int_col
,varchar_col
FROM TABLE
UNION ALL
SELECT 'hdr' as typ
-- note that different types have to be handled differently
,CASE WHEN COUNT(int_col) = 0 THEN NULL ELSE 0 END
,CASE WHEN COUNT(varchar_col) = 0 THEN NULL ELSE '' END
FROM TABLE
) AS X
-- have to get header row first
ORDER BY typ DESC -- add other sort criteria here
If we're reading your question right, there won't be a way to do this in SQL. The output of a view must be a relation - in (over-)simplified terms, it must be rectangular. That is, each row must have the same number of columns.
If you can tell us more about your data and give us some idea of what you want to do with the output, we can perhaps offer more positive suggestions.
In general, add a WHERE clause to your query, e.g.
WHERE a IS NOT NULL AND b IS NOT NULL AND c IS NOT NULL
Here, a, b, and c are your column names.
If you are joining tables together on potentially NULL columns, then use an INNER JOIN, and NULL values will not be included.
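For example (table and column names here are invented):

-- a TableB row whose a_id is NULL can never satisfy the join condition, so it is excluded
SELECT a.id, b.detail
FROM TableA a
INNER JOIN TableB b ON b.a_id = a.id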
EDIT: I may have misunderstood - the above filters out rows, but you may be asking to filter out columns, e.g. you have several columns and you only want to display those that contain at least one non-NULL value across all the rows you are returning. Using dynamic SQL offers a solution, since the set of columns varies depending upon your data.
Here's a SQL query that builds another SQL query containing the appropriate columns. You could run this query, and then submit its result as another query. It assumes 'pk' is some column that is always non-null, e.g. a primary key - this means we can prefix each additional column name with a comma.
SELECT CONCAT('SELECT pk',
    CASE (COUNT(columnA)) WHEN 0 THEN '' ELSE ',columnA' END,
    CASE (COUNT(columnB)) WHEN 0 THEN '' ELSE ',columnB' END,
    -- etc.
    ' FROM (YourQuery) base')
FROM
    (YourQuery) AS base
The query works using COUNT(column) - the aggregate function ignores NULL values, and so returns 0 for a column consisting entirely of NULLs. The query builder assumes that YourQuery uses aliases to ensure there are no duplicate column names.
While you can't put this into a view, you could wrap it up as a stored procedure that copies the data to another table - the result table. You may also set up a trigger so that the result table is updated whenever the base tables change.
I suspect what's going on is that an end user is running CrystalReports and complaining about all the empty columns that have to be removed manually.
It would actually be possible to create a stored procedure that would create a view on the fly, leaving out dataless columns. But then you would have to run this proc before using the view.
Is that acceptable?