I have a measure that counts distinct IDs on a fact table.
Let's say it looks like this:
[id]  [linkedtableid]  [datecolumn]
1     someid           date1
2     someid           date1
3     someid           date1
4     someid           date1
5     null             date1
You can see that for date1 there are 5 distinct rows, but in my case the measure returns a count of 4. I thought this might be connected somehow with UnknownMember processing, but that assumption led nowhere. I've already tried everything in my cube solution and can't find the reason for this behavior. It seems the row with the null value simply isn't counted by the distinct count function.
Also, if I fill in this null value in the relational DB and then reprocess the cube, everything counts correctly.
I've probably missed something, maybe some option somewhere.
Resolved by removing unneeded relationships between the distinct count measure group and the dimensions. There were two other dimensions, one connected through a direct relationship and one through a referenced relationship. I don't know why nulls were not counted there; perhaps it is the inability to link via a referenced relationship through a null-valued field.
Okay, so I have been trying to find out if it's possible to return how many times a particular bit of data appears.
Event_id
---------
Change
Change
Change
Problem
Task
So I want to find out how many times each of these strings appears and get a value back; for 'Change' I would expect 3, and so on.
I was hoping this would be possible in a WHERE clause, but I have never used COUNT, so I'm unsure how it all works.
Sounds like a classic COUNT, which requires a GROUP BY clause; the GROUP BY should contain all non-aggregated columns (event_id in this case).
select event_id,
count(*)
from your_table
group by event_id
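If you only need the count for one specific value, the WHERE clause you had in mind works too; a minimal sketch against the same your_table:

select count(*)
from your_table
where event_id = 'Change'

This returns 3 for the sample data above.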
I wanted to retrieve some data from a table based on two columns; see the table structure below.
Update
I want the output data based on two conditions:
1. The code value is 'Web' or 'Offline'.
2. The Memo column has the same data as the Pre_Memo column.
Output should be as shown below
So far I have gotten the output by using the same table twice, but I want the result by using the table only once, to avoid performance issues since this table holds a huge amount of data.
select distinct OrderTable.Memo,
max(OrderTable.Memo_Date) as Date1,
max(ot.Pre_Memo_Date) as Date2
from OrderTable,
OrderTable ot
where OrderTable.code in ('Web')
and ot.code in ('Offline')
and OrderTable.Memo = ot.Pre_Memo
group by OrderTable.Memo
Can anyone help with this? I want to use OrderTable only once in the query and filter based on the Memo and Pre_Memo columns, since they hold the same data.
You can use union all and do conditional aggregation:
select Memo, max(case when code = 'Offline' then Date end) as Memo_date,
       max(case when code = 'Web' then Date end) as Pre_Memo_date
from (select Date, 'Web' as code, Pre_memo as Memo
from OrderTable o
where code = 'Web'
union all
select Date, 'Offline', Memo
from OrderTable o
where code = 'Offline'
) t
group by Memo;
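If the goal is strictly to reference OrderTable once, a single pass with conditional aggregation may also work. This is a sketch built from the column names in your own query (Memo_Date, Pre_Memo_Date); I can't verify it against your actual structure:

select case when code = 'Web' then Memo else Pre_Memo end as Memo,
       max(case when code = 'Web' then Memo_Date end) as Date1,
       max(case when code = 'Offline' then Pre_Memo_Date end) as Date2
from OrderTable
where code in ('Web', 'Offline')
group by case when code = 'Web' then Memo else Pre_Memo end;

Note that, unlike your self-join, this also returns Memo values that occur on only one side (with a NULL for the other date); add a HAVING clause checking both MAX() expressions for NULL if you need the inner-join behavior.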
"I wanted to retrieve some data from a table based on two columns see the below table structure"
Providing a sample is useful to illustrate the problem (and it is desirable to do so on SO), but it is not a replacement for defining the problem, which you have failed to do.
Absent such a definition of the problem, we can only guess what you're trying to achieve. E.g.:
From the subset of tuples that have 'Offline' for the 'code' value, take the MAX() 'Date' value per appearing value of 'Memo'.
Match that (using some matching condition) to the subset of tuples that have 'Web' for the 'code' value, and retain the 'Date' value from those as 'Memo_date' in the result set.
The matching condition being that the 'Memo' value of [a tuple in] the former is equal to the 'Pre_memo' value of [the matching tuple in] the latter.
If all that is correct, then it explains why this is impossible in SQL without at least two table references. You cannot avoid doing some kind of matching, and matching by definition takes two distinct things to match (even if those two things are distinct subsets of one and the same table). In fact, it is almost certainly a fundamental design mistake to have those two distinct things in one single table, probably under the misguided belief that "having everything in one table makes things easier".
"So far i got the output by using same table two times but i wanted to get the output result by using the table only 1 time to avoid performance related issues as this table is having huge data"
From the way you have presented the question, I suspect you were hoping for some means to exploit the fact that the 'Offline' tuples come "next" after a 'Web' tuple, and that you could write the SQL in such a way that the engine could derive a sort of "single-pass" algorithm (which you probably assume would be faster).
It does not work like that. SQL tables have no inherent ordering, and as a consequence there is simply no such thing as "the next row" in a table.
I have many distinct count measures in a cube. My problem is that those measures count the null value as well. I've found two solutions to eliminate the null value:
I created named queries in the data source view for each measure, adding the condition that the column I need does not contain nulls (WHERE column IS NOT NULL). But this solution is not very practical: if you have many measures that must not count the null value, you end up creating a lot of named-query fact tables.
I created an additional column as a named calculation in the fact table, set to 1 if the column I need is null and 0 otherwise (CASE WHEN Column IS NULL THEN 1 ELSE 0 END). Then I created a Maximum measure on this additional column and a DistinctCount measure on the column I need. Finally, I created a calculation: IIF([measure that I need] - [Maximum of additional column] < 0, null, [measure that I need] - [Maximum of additional column]).
Both solutions work, but my question is whether there is a simpler solution than these two, or some option in SSAS.
If someone knows, please share.
In SQL it is possible to use
select count(column_name) from table
This doesn't count the null values.
count(*), on the other hand, does count rows where the column is null.
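A quick illustration of the difference, using a table value constructor as a stand-in for a real table:

-- c holds the values 1, 2 and NULL
select count(*) as all_rows,        -- 3: counts every row
       count(c) as non_null_rows    -- 2: skips the NULL
from (values (1), (2), (null)) as t(c);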
I have a simple join table with two id columns in SQL Server.
Is there any way to select all rows in the exact order they were inserted?
If I try to make a SELECT *, even if I don't specify an ORDER BY clause, the rows are not being returned in the order they were inserted, but ordered by the first key column.
I know it's a weird question, but this table is very big, I need to check exactly when some strange behavior began, and unfortunately I don't have a timestamp column in the table.
UPDATE #1
I'll try to explain why I'm saying that the rows are not returned in 'natural' order when I SELECT * FROM table without an ORDER BY clause.
My table was something like this:
id1 id2
---------------
1 1
2 2
3 3
4 4
5 5
5 6
... and so on, with about 90,000+ rows
Now, I don't know why (probably a software bug inserted these rows), but my table now has 4.5 million rows and looks like this:
id1 id2
---------------
1 1
1 35986
1 44775
1 60816
1 62998
1 67514
1 67517
1 67701
1 67837
...
1 75657 (100+ "strange" rows)
2 2
2 35986
2 44775
2 60816
2 62998
2 67514
2 67517
2 67701
2 67837
...
2 75657 (100+ "strange" rows)
Crazy: my table now has millions of rows. I have to find out when this happened (when the rows were inserted) because I have to delete them, but I can't just delete using *WHERE id2 IN (strange_ids)*, because there are "right" id1 rows that belong to these id2 values and must not be deleted. So I'm trying to see exactly when these rows were inserted.
When I SELECT * FROM table, the rows come back ordered by id1, as in the table above, and that is not the order they were inserted. I don't think my table is corrupted, because this is the second time this strange behavior has happened in exactly the same way; but this time there are too many rows to delete manually, as I did the first time. Why are the rows not being returned in the order they were inserted? These "strange" rows were definitely inserted yesterday, so shouldn't they be returned near the end of my table if I do a SELECT * without an ORDER BY?
A select query with no order by does not retrieve the rows in any particular order. You have to have an order by to get an order.
SQL Server does not have any default method for retrieving rows by insert order. You can do it if you have the information in the row. The best way is a primary key identity column:
TableId int identity(1, 1) not null primary key
Such a column is incremented as each row is inserted.
You can also have a CreatedAt column:
CreatedAt datetime default getdate()
However, this could have duplicates for simultaneous inserts.
The key point, though, is that a select with no order by clause returns an unordered set of rows.
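Put together, a link table that records its own insert order might look like this (a sketch; LinkTable, id1 and id2 are stand-ins for your actual join table):

create table LinkTable (
    TableId   int identity(1, 1) not null primary key, -- increments with each insert
    id1       int not null,
    id2       int not null,
    CreatedAt datetime default getdate()               -- wall-clock time of the insert
);

Ordering by TableId then reconstructs the insert sequence; CreatedAt gives you the actual time but may contain duplicates for simultaneous inserts.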
As others have already written, you will not be able to get the rows out of the link table in the order they were inserted.
If there is some sort of internal ordering of the rows in one or both of the tables this link table joins, then you can use that to try to figure out when the link table rows were created. Basically, they cannot have been created BEFORE both of the rows containing the PKs were created.
But on the other hand, you will not be able to find out how long after that they were created.
If you have decent backups, you could try restoring one or a few backups of varying age and then check whether those backups also contain this strange behaviour. That could give you at least some clue about when the strangeness started.
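For example, restoring yesterday's backup side by side with the live database and comparing row counts would bracket the insert time (a sketch; the database name, logical file names, and paths are all hypothetical):

restore database MyDb_Yesterday
from disk = 'D:\backups\MyDb_yesterday.bak'
with move 'MyDb' to 'D:\data\MyDb_Yesterday.mdf',
     move 'MyDb_log' to 'D:\data\MyDb_Yesterday_log.ldf';

select count(*) from MyDb_Yesterday.dbo.LinkTable; -- compare against the live table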
But the bottom line is that, using just a select, there is no way to get the rows out of a table like this in the order they were inserted.
If SELECT * doesn't return them in 'natural' order and you didn't insert them with a timestamp or auto-incrementing ID, then I believe you're sunk. If you've got an IDENTITY field, order by that.
But the question I have is, how can you tell that SELECT * isn't returning them in the order they were inserted?
Update:
Based on your update, it looks like there is no method to return the records as you wish. I'd guess you've got a clustered index on id1? You could try:
Select *, %%physloc%% as pl from table
order by pl desc
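Bear in mind that %%physloc%% is an undocumented feature: it exposes each row's current physical location (file:page:slot), which can change as data moves around, so treat the result as a rough diagnostic rather than a record of insert order.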
I have a column in a fact table. In some rows the column holds a NULL value, and I have a measure based on this column with the aggregate function set to DistinctCount.
This measure counts the null value too, but I don't want the null counted. What should I do?
The most efficient option would be to filter out NULL values in the data source view (using a named query, for example). This won't affect performance too much, as a distinct count measure is calculated in a separate measure group anyway.
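Such a named query might look like this (a sketch; FactSales and CustomerID are placeholders for your fact table and the distinct count column):

select *
from FactSales
where CustomerID is not null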
One popular solution is to count from a view of the table that filters out the nulls. This works, but I would bet it requires another scan of the fact table.
Another solution is like fighting fire with fire.
Add a computed column that is 0 if it's null and 1 if it's not:
CASE WHEN _DollarsLY IS NULL THEN 0 ELSE 1 END AS _DistinctCountHackLY
Then you can do something like this in a cube calculation:
iif(_DistinctCountHackLY = 2 or _DollarsLY = null, _DistinctUPCLY - 1, _DistinctUPCLY)
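If you prefer to keep the computed column in the relational table rather than as a named calculation, the DDL might look like this (a sketch; FactSales is a placeholder table name, and the column names follow the example above):

alter table FactSales
add _DistinctCountHackLY as (case when _DollarsLY is null then 0 else 1 end);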