Count non null values in multiple columns with LINQ - sql

The following SQL query counts non-null values of multiple columns in a single query (as in this answer):
SELECT COUNT(Id) AS Total,
COUNT(Column_1) AS Column_1_Non_Null_Count,
COUNT(Column_2) AS Column_2_Non_Null_Count,
COUNT(Column_3) AS Column_3_Non_Null_Count,
...
FROM MyTable
Is there a corresponding Linq query which executes a SQL query similar to this one (without a subquery for each column count)?
Counting null values instead of non-null values would also be ok.

I'm not sure there is a good way to do this with Entity Framework; I think it is better to do it with raw SQL.
But assuming you want to do it with Entity Framework, one way may be to create several queries using the FutureCount method from the EF.Extended library. With the Future methods from EF.Extended, all queries are deferred until the result of one of them is accessed, and the data is then retrieved in a single round trip to the database server.
var queryColumn1 = MyDBContext.MyTable.Where(q => q.Column1 == null).FutureCount();
var queryColumn2 = MyDBContext.MyTable.Where(q => q.Column2 == null).FutureCount();
...
int countColumn1 = queryColumn1.Value;
int countColumn2 = queryColumn2.Value;
What I dislike about this solution is the readability of the code. As I said, I think the better approach is to use raw SQL or a stored procedure.
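To illustrate the raw SQL route, here is a minimal sketch in Python using the standard sqlite3 module rather than Entity Framework. The table contents are hypothetical and the column list is shortened to two columns; the point is only that COUNT(column) skips NULLs, so one query returns every count.

```python
import sqlite3

# Hypothetical in-memory table, only to demonstrate the single-query approach.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (Id INTEGER PRIMARY KEY, Column_1 TEXT, Column_2 TEXT)"
)
conn.executemany(
    "INSERT INTO MyTable (Column_1, Column_2) VALUES (?, ?)",
    [("a", None), (None, None), ("c", "z")],
)

# COUNT(column) ignores NULLs, so a single statement yields all non-null counts.
total, col1_non_null, col2_non_null = conn.execute(
    "SELECT COUNT(Id), COUNT(Column_1), COUNT(Column_2) FROM MyTable"
).fetchone()

# Null counts follow by subtracting from the total row count.
col1_null = total - col1_non_null
col2_null = total - col2_non_null
```

If null counts are what you need, subtracting from the total avoids a second query.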

Related

Counting results in SQLite, given query with functions

As you may (or may not) already know, SQLite does not provide information about the total number of results from a query. One has to wrap the query in SELECT count(*) FROM (original query); in order to get the row count.
This worked perfectly fine for me, until one of my users created a custom SQL function (you can define your own functions in SQLite) that does an INSERT into another, unrelated table. Then he executes this query:
SELECT customFunction() FROM primaryTable WHERE primaryKeyColumnId = 1;
The query always returns exactly 1 row. It turned out that customFunction() was called twice (and inserted 2 rows into that other table), because my application ran his query as usual and then ran count(*) on that query as a follow-up.
How to approach this problem? How to execute only the original query and still have a row count from SQLite?
I'm using SQLite (3.13.0) C API.
You either have to remove such function calls from the query, or accept that you cannot get the row count until you have actually stepped through all the result rows.
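Counting while stepping can be sketched as follows. This uses Python's standard sqlite3 module instead of the C API, and a Python list (calls) stands in for the INSERT side effect of the user's function; both are assumptions made for illustration. The query runs once, and the count is known only after the last row has been fetched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE primaryTable (primaryKeyColumnId INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO primaryTable VALUES (1)")

calls = []  # hypothetical stand-in for the rows the user's function would INSERT

def custom_function():
    # Record each invocation so the side effect can be observed.
    calls.append(1)
    return "done"

conn.create_function("customFunction", 0, custom_function)

# Run the original query exactly once and count rows while stepping through them.
cursor = conn.execute(
    "SELECT customFunction() FROM primaryTable WHERE primaryKeyColumnId = 1"
)
rows = []
for row in cursor:
    rows.append(row)
row_count = len(rows)
```

In the C API the equivalent is to increment a counter each time sqlite3_step returns SQLITE_ROW.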

What is the most efficient way to process rows in a table?

I am teaching myself basic and intermediate SQL concepts for a project I am working on.
I have a lot of data that needs to undergo processing so it can be presented in different ways. Right now I am using scalar functions calls in my select statement to process the data.
A simple example: let's say I have an attribute in my table called fun, with data type int. I want to process my table so that all rows with fun < 10 are 'foo' and all rows with fun > 10 are 'faa'.
So I write an SQL function something like
CREATE FUNCTION dbo.fooORfaa
(
@fun AS int
)
RETURNS VARCHAR(3)
AS
BEGIN
IF (@fun < 10)
RETURN 'foo'
RETURN 'faa'
END
Then I use my function in something like this select statement
select dbo.fooORfaa([mytable].[fun]) AS 'blah'
from mytable
This example is trivial, but in my real code I need to perform some fairly involved logic against one or more columns, and I am selecting sub-results from procedures, joining tables together, and doing the other things you need to do in a database.
I have to process lots of records in a reasonable time span. Is this method an efficient way to tackle this problem? Is there another technique I should be using instead?
For this use case, you need a CASE construct.
SELECT
CASE
WHEN T.fun < 10 THEN 'foo'
ELSE 'faa'
END foo_faa
FROM
myTable T
Always try to use set-based operations. User-defined functions will (mostly) kill your performance, and should be a last resort.
See: CASE (Transact-SQL)

SQL - IN clause vs equals operator for small list

Which should be the preferred and efficient way?
where @TeamId in (Team1Id, Team2Id)
or
where @TeamId=Team1Id or @TeamId=Team2Id
I am using sql server 2008.
Edit
When I checked execution plans, both the queries showed that they are using indexes and same execution plan.
Both are the same.
SQL Server converts this
where @TeamId in (Team1Id, Team2Id)
into
where @TeamId=Team1Id or @TeamId=Team2Id
It's better to write IN rather than OR, because it is more readable and easier to write.
For the specific example you provide, of testing a variable, IN is simply syntactic sugar for multiple ORs.
However, in the related case of selecting rows of a relation, using a join to another relation is superior, particularly if the data field being compared is indexed or the list of comparison values grows. Such a comparison relation is easily created using a static sub-query like this:
select *
from data
join (
select Team1Id as TeamId union all
select Team2Id
) comparison on comparison.TeamId = data.TeamId
This technique of a static sub-query is widely applicable to many circumstances.
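As a rough check of the points above, the following sketch runs all three forms against SQLite through Python's standard sqlite3 module (not SQL Server 2008, so it says nothing about execution plans there). The data table contents and the team1/team2 values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (TeamId INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")]
)

team1, team2 = 1, 3  # hypothetical comparison values

# IN list
in_rows = conn.execute(
    "SELECT name FROM data WHERE TeamId IN (?, ?) ORDER BY name", (team1, team2)
).fetchall()

# Equivalent chain of OR comparisons
or_rows = conn.execute(
    "SELECT name FROM data WHERE TeamId = ? OR TeamId = ? ORDER BY name",
    (team1, team2),
).fetchall()

# Join against a static sub-query that holds the comparison values
join_rows = conn.execute(
    """
    SELECT data.name
    FROM data
    JOIN (SELECT ? AS TeamId UNION ALL SELECT ?) comparison
      ON comparison.TeamId = data.TeamId
    ORDER BY data.name
    """,
    (team1, team2),
).fetchall()
```

All three queries select the same rows; they differ only in how the comparison values are supplied.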

Rails 3, active record. How do I get an object from its nested values

I am using Postgres and have the (working) SQL as something like:
SELECT
distinct(users.id),
users.*
FROM
public.addons,
public.containers,
public.users
WHERE
( (addons."value" = 'Something' AND addons."name" = 'bob') OR
addons."value" = 'Something else' AND addons."name" = 'bill') AND
(containers.id = addons.site_id AND
users.id = containers.user_id)
What I want to do is format this so that it returns a set of user objects. I am not sure how to format this so that it's something like (not working):
@users = User.find(:include => [:users, :containers, :addons], :conditions => {( (addons."value" = 'Something' AND addons."name" = 'bob') OR
addons."value" = 'Something else' AND addons."name" = 'bill') AND
(containers.id = addons.site_id AND
users.id = containers.user_id) } )
Is this even possible?
First, you should understand that DISTINCT is not a function, it's a query modifier. The parentheses don't change that. DISTINCT means that if there are any rows where every returned column is identical, the duplicates are reduced to a single row. You can't reduce to a single row per distinct value in a single column (that's the job of GROUP BY).
Second, SQL result sets properly have one value per row per column. You could do some kind of aggregated string-concatenation like MySQL's GROUP_CONCAT, or an equivalent aggregate function in PostgreSQL, but then you've got a string mashing together all the values, which you then have to explode in Ruby code after fetching it.
So if you have multiple rows to fetch for a given users.id, just let them come back to the app in multiple rows. Then do some post-processing of the data in your application. Loop over the fetched rows, and add the attributes to a Ruby object one by one. If you're fetching data for multiple user ids in a given query, stuff the attributes into a hash of your user objects, keyed by user id. That way you don't need to fetch the rows in any particular order.
Third, I'm inferring from your "name" and "value" columns that you're using the Entity-Attribute-Value design, where users have multiple attributes, one per row. This is a design that subverts traditional relational data access, so you do end up having to write application code to preprocess and postprocess the data. If you could put each attribute into its own conventional column, which would be the relational way of storing data, your SQL queries would be a lot simpler.
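The post-processing loop from the second point can be sketched like this. It is written in Python rather than Ruby, and the rows are hypothetical (user id, attribute name, attribute value) tuples standing in for whatever the query returns; the same shape works with a Ruby hash.

```python
# Hypothetical result rows: (user_id, attribute name, attribute value).
# They arrive in no particular order.
rows = [
    (1, "bob", "Something"),
    (2, "bill", "Something else"),
    (1, "bill", "Something else"),
]

users = {}
for user_id, name, value in rows:
    # setdefault creates the per-user dict the first time a user id is seen,
    # so row order does not matter.
    users.setdefault(user_id, {})[name] = value
```

Because the hash is keyed by user id, attributes for the same user end up together even when their rows are interleaved with other users' rows.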

CreateCriteria and MONTH

Is it possible to use MONTH in a CreateCriteria-statement?
Does NHibernate support YEAR and/or MONTH?
I have a sql-statement like
select obs2.Lopnr from Obs obs2 where MONTH(obs2.Datum)=11
Best Regards from
Mats
ICriteria supports arbitrary SQL as a restriction. Therefore you could do:
var criteria = session.CreateCriteria(typeof(Obs))
.Add(Expression.Sql("MONTH({alias}.Datum) = ?", 11, NHibernateUtil.Int32));
var results = criteria.List<Obs>();
This will execute a SQL query with {alias} replaced by the alias that NHibernate is using for the Obs table. Of course, this limits your portability to other databases, as SQL is now embedded in your query.
Another thing to remember is that the names you're using here are the mapped property and class names, not the underlying column and table names.
I don't believe it's possible in a criteria statement. The date functions (year, month, day) are supported in HQL queries so they are usable in that way.