What value is selected into parameter in SQL query without where clause - sql

For example, I have this query:
SELECT @param = column FROM table
What value is assigned to @param?
I tried this and can't figure out which value is being pulled. It is neither the oldest record nor the newest one.

The documentation states:
the variable is assigned the last value that is returned
But without a WHERE clause that uniquely identifies a row, or an ORDER BY clause that specifies a unique value for ordering, the row chosen for the variable assignment is undefined and not deterministic when the table has more than one row.
You could add ORDER BY to the query to return the last ordered row. A more efficient method would be to use SELECT TOP(1)...ORDER BY...DESC. Conversely, SELECT TOP(1)...ORDER BY...ASC will return the first ordered row. Again, the ORDER BY column(s) need to be unique for a deterministic value.

The variable receives a value from the referenced column. The query looks like it should have a TOP 1 in it, together with a WHERE clause designed to fetch one row only.
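The advice above (order by a unique column and take one row) can be sketched as follows. This is only an illustration under assumptions: SQLite stands in for SQL Server, so LIMIT 1 replaces TOP (1) and the result is read into a Python variable instead of an @variable. The table t and its columns are hypothetical.

```python
import sqlite3

# Hypothetical table and column names; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, col TEXT)")
conn.executemany(
    "INSERT INTO t (id, col) VALUES (?, ?)",
    [(1, "oldest"), (2, "middle"), (3, "newest")],
)

def pick(order):
    # LIMIT 1 plays the role of TOP (1). Ordering by the unique id column
    # is what makes the chosen row deterministic.
    sql = f"SELECT col FROM t ORDER BY id {order} LIMIT 1"
    return conn.execute(sql).fetchone()[0]

first_value = pick("ASC")    # first ordered row
last_value = pick("DESC")    # last ordered row
print(first_value, last_value)
```

Without the ORDER BY, the engine would be free to return any of the three rows.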

Related

Return the row that includes the maximum value of a specific column if two rows have the same values.

I have a result from a SQL query, which is displayed below.
I want to build a SQL query that, whenever two or more rows have the same number in the first column, returns only the row with the maximum number in the last column.
For instance, in the table you can see that the top two rows have the same number in the first column, which is 2195333. When the query runs, it should return the first row and the rest of the rows, discarding only the 2nd row, since the last column of the 2nd row is 1, which is smaller than the 2 in the 1st row.
I was thinking about using a while loop in SQL: loop from the first row to the last row and, if any rows have the same value in the first column, return the row which has the maximum value in the last column. Since I am new to SQL, I have no idea how to implement it. Please help me. Thanks
The question, sample data, and desired results are lacking a bit.
But if I understand your question, you can use the WITH TIES clause in concert with Row_Number()
Example
Select Top 1 with ties *
From YourTable
Order By Row_Number() over (Partition By YourCol1 Order By YourLastCol Desc)
Edit: use Dense_Rank() if you want to see ties
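For readers without SQL Server at hand, the same per-group idea can be tried in SQLite. This sketch is an assumption-laden variant, not the answer's exact query: SQLite has no TOP 1 WITH TIES, so the row number is computed in a subquery and filtered with rn = 1. The Name column and the sample rows are made up for illustration, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (YourCol1 INTEGER, Name TEXT, YourLastCol INTEGER)"
)
conn.executemany("INSERT INTO YourTable VALUES (?, ?, ?)", [
    (2195333, "a", 2),
    (2195333, "b", 1),   # same first column, smaller last column: dropped
    (2195334, "c", 1),
    (2195335, "d", 3),
])

# Number the rows inside each YourCol1 group, highest YourLastCol first,
# then keep only the first row of every group.
sql = """
SELECT YourCol1, Name, YourLastCol
FROM (
    SELECT *, ROW_NUMBER() OVER (
        PARTITION BY YourCol1 ORDER BY YourLastCol DESC
    ) AS rn
    FROM YourTable
)
WHERE rn = 1
ORDER BY YourCol1
"""
rows = conn.execute(sql).fetchall()
print(rows)
```

Swapping ROW_NUMBER() for DENSE_RANK() in the subquery would keep every row tied for the maximum instead of a single one.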

Get latest data for all people in a table and then filter based on some criteria

I am attempting to return the row of the highest value for timestamp (an integer) for each person (that has multiple entries) in a table. Additionally, I am only interested in rows with the field containing ABCD, but this should be done after filtering to return the latest (max timestamp) entry for each person.
SELECT table."person", max(table."timestamp")
FROM table
WHERE table."type" = 1
HAVING table."field" LIKE '%ABCD%'
GROUP BY table."person"
For some reason, I am not receiving the data I expect. The returned table is nearly twice the size of expectation. Is there some step here that I am not getting correct?
You can first build a derived table holding max(timestamp) per person, join it back to the original table to recover the full latest row, and then filter in the outer query:
SELECT t."person", t."timestamp"
FROM table t
JOIN (SELECT "person", MAX("timestamp") AS max_ts FROM table GROUP BY "person") m
ON m."person" = t."person" AND m.max_ts = t."timestamp"
WHERE t."type" = 1 AND t."field" LIKE '%ABCD%'
Direct answer: as I understand your end goal, just move the HAVING clause to the WHERE section:
SELECT
table."person", MAX(table."timestamp")
FROM table
WHERE
table."type" = 1
AND table."field" LIKE '%ABCD%'
GROUP BY table."person";
This should return no more than 1 row per table."person", with their associated maximum timestamp.
As an aside, I am surprised your query worked at all. Your HAVING clause referenced a column that is neither grouped nor aggregated in your query. From the documentation (and my experience):
The fundamental difference between WHERE and HAVING is this: WHERE selects input rows before groups and aggregates are computed (thus, it controls which rows go into the aggregate computation), whereas HAVING selects group rows after groups and aggregates are computed.
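Because the order of filtering and aggregating matters here, a small experiment may help. The sketch below uses SQLite and a hypothetical entries table (the names ts and entries are assumptions chosen to avoid reserved words). It runs both orders on the same data: filtering with WHERE before MAX, and finding each person's latest row first and testing it afterwards. Which of the two matches the asker's intent depends on how the question is read; the sketch only shows that they can differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries (person TEXT, ts INTEGER, type INTEGER, field TEXT)"
)
conn.executemany("INSERT INTO entries VALUES (?, ?, ?, ?)", [
    ("ann", 1, 1, "xxABCDxx"),
    ("ann", 2, 1, "other"),    # ann's latest row does not contain ABCD
    ("bob", 1, 1, "other"),
    ("bob", 2, 1, "ABCD"),     # bob's latest row contains ABCD
])

# Filter first, then aggregate: WHERE removes rows before MAX is computed.
filter_then_max = conn.execute("""
    SELECT person, MAX(ts) FROM entries
    WHERE type = 1 AND field LIKE '%ABCD%'
    GROUP BY person ORDER BY person
""").fetchall()

# Aggregate first, then filter: find each person's latest row, then test it.
max_then_filter = conn.execute("""
    SELECT e.person, e.ts
    FROM entries AS e
    JOIN (SELECT person, MAX(ts) AS max_ts FROM entries
          WHERE type = 1 GROUP BY person) AS m
      ON m.person = e.person AND m.max_ts = e.ts
    WHERE e.type = 1 AND e.field LIKE '%ABCD%'
    ORDER BY e.person
""").fetchall()

print(filter_then_max)
print(max_then_filter)
```

In the first result ann appears with her older ABCD row; in the second she is absent because her newest row does not match.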

PowerPivot - only newest values on current context

I have a problem with PowerPivot.
Let's have a look at only 3 columns in my data source:
date - clientid - category
Category can only be 1 or 2.
In the data source you can often find the same clientid several times within a given time period, sometimes with a different category.
So in my pivot table, I can see the distinct count of my clients depending on the chosen timeline.
But, of course, the sum of clients for cat=1 and cat=2 is bigger than the distinct count.
Is it possible to count only the newest entries for every clientid, so that the sum of the two cats is the same as the distinct count of my clients?
Thanks in advance to everybody who helps and spends their time on this.
Stefan
This was fun! Thanks for an interesting problem. Normally for this sort of thing we might flag the most recent entry for a given clientid in an extra field, but yours needs to be dynamic at runtime based on your date filter selection.
Here we go. Be warned, it's a doozy.
CountCat:=
COUNTROWS(
    FILTER(
        GENERATE(
            VALUES( ClientCats[clientid] )
            ,CALCULATETABLE(
                SAMPLE(
                    1
                    ,SUMMARIZE(
                        ClientCats
                        ,ClientCats[date]
                        ,ClientCats[category]
                    )
                    ,ClientCats[date]
                    ,DESC
                )
                ,ALL( ClientCats[category] )
            )
        )
        ,CONTAINS(
            VALUES( ClientCats[category] )
            ,ClientCats[category]
            ,ClientCats[category]
        )
    )
)
Let's work through it.
COUNTROWS() is trivial.
FILTER() takes a table as its first argument. It creates a row context by iterating row-by-row through this table. It evaluates a boolean expression in each row context and returns the rows for which the expression returns true. We're not getting to that expression for a little while here. Let's look at the table we'll be filtering.
GENERATE() takes a table as its input and creates a row context by iterating row-by-row through that table. For each row context it evaluates a second table, and cross joins the rows that exist in the second table expression in the current row context from the first table with the row from the first table.
Our first table is VALUES( ClientCats[clientid] ), which is simply a distinct list of all [clientid]s in context from the pivot table.
We then evaluate CALCULATETABLE() for each row context, aka for each [clientid]. CALCULATETABLE() evaluates a table expression in the filter context determined by its second and subsequent arguments.
SAMPLE() is the table we'll evaluate. SAMPLE() is like TOPN(), but with ties broken non-deterministically. SAMPLE( 1, ... ) always returns one row. TOPN( 1, ... ) returns all rows that are tied for first position.
SAMPLE(), here, will return one row from the table defined by SUMMARIZE(). SUMMARIZE() groups by the fields in a table that are named. Thus we have a table of all distinct values of [date] and [category] that are included based on the context determined by our CALCULATETABLE(). SAMPLE()'s third argument defines a sort-by column to determine which rows are first, and its fourth determines the sort order. Thus for each [clientid] we are returning the latest row in the SUMMARIZE() for that [clientid].
The ALL() in our CALCULATETABLE() strips the context from the field [category] that might be coming in from our pivot table. This means that every time we evaluate our GENERATE() (remember we're still in that function here), we get a table of all [clientid]s that exist in context, and their most recent [category], even when we're evaluating in a pivot cell that has filtered [category].
That sounds like a problem - we'd expect the same count now for every pivot cell. And that's what we'd get if we did COUNTROWS( GENERATE() ). But wait, we're still in FILTER()!
Now we get to the boolean expression which will filter the rows of that GENERATE(). CONTAINS() takes a table as its first argument, a reference to a column in that table as its second argument, and a scalar value as its third argument. It returns true if the column in argument 2, of the table in argument 1, contains the value in argument 3.
We are outside of the CALCULATETABLE(), and therefore context exists on [category]. VALUES() returns the unique rows in context. In any pivot cell filtered by [category], this will be a 1x1 table, but in our grand total, it will have multiple rows.
So, the column in that VALUES() we want to test is [category] (the only column that exists in that VALUES()).
The value we want to test for is referred to by ClientCats[category]. That third argument evaluates [category] in the row context determined by FILTER(). Thus we return true for every row that matches the current filter context (in a pivot cell) of ClientCats[category]. Mind-bending stuff here.
Anyway, the upshot is that in a [category]-filtered pivot cell, we get the number of distinct [clientid]s that have, for the time frame selected, that [category] value as their most recent category.
For the grand total we get every [clientid] in context.
This will probably not have a very good performance curve.
Here's a sample workbook to play with the functioning measure defined.
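For anyone who finds the DAX hard to follow, the underlying logic can be sketched in plain Python. This is not how Power Pivot evaluates the measure, only a model of the result it aims for: within the selected date range, keep each client's newest row and count clients by the category on that row. The sample rows and the function name are hypothetical, and a tie on the same date keeps the row encountered first.

```python
from collections import Counter

# Hypothetical rows of (date, clientid, category).
# ISO-formatted dates compare correctly as plain strings.
rows = [
    ("2015-01-01", "c1", 1),
    ("2015-02-01", "c1", 2),
    ("2015-01-15", "c2", 1),
    ("2015-03-01", "c3", 2),
    ("2015-03-05", "c3", 1),
]

def count_by_latest_category(rows, start, end):
    """Count clients per category using only each client's newest row in [start, end]."""
    latest = {}  # clientid -> (date, category) of the newest row seen so far
    for date, clientid, category in rows:
        if start <= date <= end:
            if clientid not in latest or date > latest[clientid][0]:
                latest[clientid] = (date, category)
    return Counter(category for _, category in latest.values())

counts = count_by_latest_category(rows, "2015-01-01", "2015-12-31")
january = count_by_latest_category(rows, "2015-01-01", "2015-01-31")
print(counts, january)
```

Because every client contributes exactly one row, the per-category counts always add up to the distinct client count for the chosen period.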
Edit
Based on replies below.
Do you need to maintain in the model all the rows that have [UseClient] <> 1? Deduping and flagging is always easier in tools other than Power Pivot.
I have no idea how you've determined the values for 1 in [UseClient]. None are the most recent entry for a given [ClientID]. If you want to just flag the most recent row, which is what it sounds like you want, but not what your workbook looks like, you can do a calculated column much more easily than doing this in a measure:
=SAMPLE(
    1
    ,CALCULATETABLE( // return all dates for the [clientid] on current row
        VALUES( ClientCats[date] )
        ,ALLEXCEPT( ClientCats, ClientCats[clientid] )
    )
    ,ClientCats[date]
    ,DESC
) = ClientCats[date] // row context in table
This will return true when the value of [date] on a given row is equal to the maximum [date] for the client on that row.
One thing you could easily do in Power Query is to group [clientid] and take the max date for each [clientid]. Then you have one row per client.
This is all different than your original question, though, because your original wants to find the maxes based on date selection. But a calculated column is not updated based on filter context. It's only recalculated at model refresh time. If you're willing to use a calculated column, then just deal with your data issues before bringing it into Power Pivot.

Column value divided by row count in SQL Server

What happens when each column value in a table is divided by the total row count of the table? What operation does SQL Server actually perform? Can anyone help?
More specifically: what is the difference between sum(column value) / row count and column value / row count? For example:
select cast(officetotal as float) / count(officeid) as value,
       sum(officetotal) / count(officeid) as average
from check1
where officeid = '50009'
group by officeid, officetotal
What operation is performed by each of the two select expressions?
In your example both expressions return the same value, because officeid is fixed by the WHERE clause and officetotal is also in the GROUP BY clause: every distinct officetotal forms its own group, so count(officeid) is 1 as long as no two rows share the same officetotal. The example therefore does not show a difference, because no real grouping takes place.
When you remove officetotal from the GROUP BY, you will get following message:
Column 'officetotal' is invalid in the select list because it is not
contained in either an aggregate function or the GROUP BY clause.
It means that you cannot use officetotal and SUM(officetotal) in one select, because SUM is meant to work on a set of values and it is pointless to SUM only one value.
It is just not possible to write it this way in SQL using GROUP BY. If you look for something like first or last value from a group, you will have to use MIN(officetotal) or MAX(officetotal) or some other approach.
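The effect of the GROUP BY columns can be checked with a small experiment. The sketch below uses SQLite in place of SQL Server and three made-up rows for office 50009. The cast to REAL is an addition to avoid integer division and is not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE check1 (officeid TEXT, officetotal INTEGER)")
conn.executemany("INSERT INTO check1 VALUES (?, ?)", [
    ("50009", 10), ("50009", 20), ("50009", 60),
])

# Grouping by officeid AND officetotal: every distinct total is its own
# group, so COUNT is 1 and SUM / COUNT just returns the value itself.
per_value = conn.execute("""
    SELECT officetotal, COUNT(officeid),
           CAST(SUM(officetotal) AS REAL) / COUNT(officeid)
    FROM check1 WHERE officeid = '50009'
    GROUP BY officeid, officetotal ORDER BY officetotal
""").fetchall()

# Grouping by officeid only: one group of three rows, so SUM / COUNT
# is the average of the three totals.
per_office = conn.execute("""
    SELECT officeid, COUNT(officeid),
           CAST(SUM(officetotal) AS REAL) / COUNT(officeid)
    FROM check1 WHERE officeid = '50009'
    GROUP BY officeid
""").fetchall()

print(per_value)
print(per_office)
```

Only the second query computes an average; the first simply echoes each row's own value.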

how to find index of a row in sql server?

How can I fetch column names from a table by index? I want to create a table whose column names are the values found in the last column of a query's result set. Those values may differ from one execution to the next, so I want to know how to fetch the values of that last column and create a temp table that uses them as its column names.
Is there any way/function in sql server to dynamically form that?
sp_helpindex:
Reports information about the indexes
on a table or view.
You can also use ROW_NUMBER as explained here