SQL Server 2008R2 Indexes for log table - sql

I have to do some logging in my app. The daily load is about 50,000 insertions. I have to store several fields, the most important being event type and event date/time. There are going to be queries with sorting, paging and filtering. What indexes (on which fields, and clustered or non-clustered) should I create in order to minimize insertion and query time (at least for select .. where on the fields above)? Googling gives various ideas on the subject, so I can't figure out what to do.
UPD
My POCO:
public class LogEntry
{
public DateTime LoggedAt { get; set; }
public int EventType { get; set; }
public bool IsSuccesful { get; set; }
public string Message { get; set; }
public string URL { get; set; }
public string Login { get; set; }
public string IP { get; set; }
public string UserAgent { get; set; }
}
The most frequent query is select .. where (LoggedAt between .. and ..) and (EventType=..). Sometimes there may be additional AND conditions in the where clause.
No update operations are planned. Deletions are possible, but only occasionally and in large batches.

The following statements only illustrate some possible cases. Of course it is difficult to give you a specific solution (you would have to describe the selectivity of your data), but they show some points of view that may help.
Some rules that can help you:
- more indexes -> slower inserts
- build the best index for the majority of your queries, on columns whose values are as selective (ideally unique) as possible
- clustered index - replaces your heap with a B-tree
- nonclustered index - a separate object whose entries point to rows in your heap; consumes more space -> index + data
-- your table might look like this:
CREATE TABLE LogEntry (LoggedAt DATETIME,
EventType INT,
IsSuccesful BIT,
Message VARCHAR(511),--check your input to set it correctly
URL VARCHAR(511),--check your input to set it correctly
Login VARCHAR(127),--check your input to set it correctly
IP VARCHAR(63),--check your input to set it correctly
UserAgent VARCHAR(63))--check your input to set it correctly
-- For example, for the following select
SELECT *
FROM LogEntry
WHERE LoggedAt BETWEEN DATEADD(dd,-1,GETDATE()) AND GETDATE() AND
EventType = 1
-- the following index can help (of course, unique values are best for clustered indexes)
CREATE CLUSTERED INDEX idx_LogEntry_LoggedAt_EventType ON dbo.LogEntry (LoggedAt,EventType)
-- For example, for the following select
SELECT Message
FROM LogEntry
WHERE LoggedAt BETWEEN DATEADD(dd,-1,GETDATE()) AND GETDATE() AND
EventType = 1
-- the following index can help (as an alternative to the clustered index above)
CREATE NONCLUSTERED INDEX idx_LogEntry_LoggedAt_EventType ON dbo.LogEntry (LoggedAt,EventType) INCLUDE (Message)
-- and so on ... it really depends on what you want
-- for me, the following solution would be really helpful:
CREATE TABLE LogEntryO (LogEntryId INT IDENTITY PRIMARY KEY CLUSTERED, -- my clustered index
LoggedAt DATETIME,
EventType INT,
IsSuccesful BIT,
Message VARCHAR(511),--check your input to set it correctly
URL VARCHAR(511),--check your input to set it correctly
Login VARCHAR(127),--check your input to set it correctly
IP VARCHAR(63),--check your input to set it correctly
UserAgent VARCHAR(63))--check your input to set it correctly
-- plus the following index
CREATE NONCLUSTERED INDEX idx_LogEntryO_LoggedAt_EventType ON dbo.LogEntryO (LoggedAt) INCLUDE (LogEntryId)
-- and my query would look like this
;WITH CTE AS (SELECT LogEntryId FROM dbo.LogEntryO WHERE LoggedAt BETWEEN DATEADD(dd,-1,GETDATE()) AND GETDATE())
SELECT *
FROM dbo.LogEntryO a
JOIN CTE b ON a.LogEntryId = b.LogEntryId
WHERE a.EventType = 1
It is really hard to design the best solution for you, because it seems that you are using a C# class to access this table. For example, you could be using some kind of ORM, such as Entity Framework or similar.

It's difficult to give a specific answer without more information. Anyway, I think you can try an index on your two most important fields.
Remember (maybe you already know this) that the order of the fields in the index matters with respect to the queries you run.
If you know that the query is always the same, you can add other fields (as index keys or as INCLUDE columns).
Evaluate the cardinality of the fields' values too.
If possible, regularly check the information SQL Server stores about index usage.
If it is an OLTP system with frequent updates/deletes, adding too many indexes may hurt.
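The point about field order can be checked by asking the database for its query plan. The sketch below is only an illustration under assumptions: it uses SQLite (through Python's sqlite3 module) as a stand-in, not SQL Server, and the index name idx_type_date and the reduced column list are made up for the example. It puts the equality column (EventType) before the range column (LoggedAt), which is one ordering worth trying, and prints the plan SQLite chooses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, hypothetical version of the log table (SQLite types, not SQL Server).
conn.execute(
    "CREATE TABLE LogEntry (LoggedAt TEXT, EventType INTEGER, Message TEXT)"
)
# Equality column first, range column second.
conn.execute("CREATE INDEX idx_type_date ON LogEntry (EventType, LoggedAt)")

query = (
    "SELECT Message FROM LogEntry "
    "WHERE EventType = ? AND LoggedAt BETWEEN ? AND ?"
)
params = (1, "2024-01-01 00:00:00", "2024-01-02 00:00:00")

# Each EXPLAIN QUERY PLAN row has four columns; the last one is the description.
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query, params)]
for line in plan:
    print(line)
```

On SQL Server the equivalent check is to look at the actual execution plan for the real query; the optimizer there may decide differently, so treat this only as a way to reason about column order.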


How can I adjust a Prisma schema to 'autofill'

I just started using Prisma and was wondering whether it can perform an 'autofill'.
For instance: I have a leaderboard, and whenever I enter the teamId, the tname column should get filled in automatically.
A piece of the schema is as follows.
model LeagueTable {
  id Int @id @default(autoincrement())
  competitionId Int
  teamId Int
  played Int
  won Int
  drawn Int
  lost Int
  points Int
  goalsFor Int
  goalsAgainst Int
  goalDifference Int
  tname String
  competition Competitions @relation("competition-lt", fields: [competitionId], references: [id])
  team Teams @relation("team", fields: [teamId], references: [id])
}
model Teams {
  id Int @id @default(autoincrement())
  name String @unique
  matchesAsAway Fixtures[] @relation("awayTeam")
  matchesAsHome Fixtures[] @relation("homeTeam")
  leagueTable LeagueTable[] @relation("team")
}
I was thinking of adding a relation, but at the same time I am trying to normalise the schema as much as possible.
[Image: Prisma Studio LeagueTable preview]
As you can see, the tname column is empty, and I would need to fill it in manually. Is there a way to have it filled in when inserting the teamId?
The only way to accomplish this in Prisma is to add another relation as you mentioned.
It might be possible with custom SQL triggers (like before insert), but you would be well outside the Prisma happy path.
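To make the trigger idea concrete, here is a minimal sketch of what such a trigger could look like. It is an assumption-laden illustration, not something Prisma generates: it uses SQLite through Python's sqlite3 module, a cut-down version of the two tables, and a made-up trigger name (league_table_fill_tname). It uses an AFTER INSERT trigger, since in SQLite that is the straightforward way to fill a column on the new row; the syntax in another database would differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Teams (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE LeagueTable (
    id     INTEGER PRIMARY KEY,
    teamId INTEGER NOT NULL REFERENCES Teams (id),
    points INTEGER NOT NULL DEFAULT 0,
    tname  TEXT
);
-- Hypothetical trigger: copy the team name into tname after each insert.
CREATE TRIGGER league_table_fill_tname
AFTER INSERT ON LeagueTable
FOR EACH ROW
BEGIN
    UPDATE LeagueTable
    SET tname = (SELECT Teams.name FROM Teams WHERE Teams.id = NEW.teamId)
    WHERE id = NEW.id;
END;
INSERT INTO Teams (id, name) VALUES (1, 'Lions'), (2, 'Tigers');
""")

# Insert a row without tname; the trigger fills it in.
conn.execute("INSERT INTO LeagueTable (teamId, points) VALUES (?, ?)", (2, 3))
tname = conn.execute(
    "SELECT tname FROM LeagueTable WHERE teamId = ?", (2,)
).fetchone()[0]
print(tname)
```

Note that the copied name goes stale if the team is later renamed, which is one reason reading the name through the relation is usually the simpler choice.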

What is the best way to update a known row in SQLite?

For example, if I want to frequently update a user's score during a session,
is there a more performant way than - UPDATE score FROM [databaseName] WHERE name = [username]
I feel like I should not be continuously needing to search through the entire database when the value's location has been previously found.
Thanks in advance
You wouldn't search through the entire database if you had an index on name:
create index idx_<table>_name on <table>(name);
where <table> is the name of the table you are referring to, not the database.
Incidentally, if you wanted to update the table, you would use update, not select. But update can still use the index.
You can use a property here, so that you query and update the database only when you need to.
private int score;
public int Score
{
get
{
score = searchDatabase();
return score;
}
set
{
UpdateDatabase(value);
score = value;
}
}
Debug.Log(Score);
The only thing that comes to mind that resembles what you mention as the value's location, is the value of the primary key column of the row with the value that you search for.
So, if the primary key in the table is id and you only have the name of the person, you can query once to get the row's id:
SELECT id FROM tablename WHERE name = ?
and then use that id in all subsequent updates:
UPDATE tablename SET score = ? WHERE id = ?
This is the fastest way to do the updates.
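The lookup-once, update-by-id pattern can be sketched as follows. This is a minimal example using Python's sqlite3 module; the table name users, its columns, and the helper set_score are assumptions made up for the illustration. In SQLite a column declared INTEGER PRIMARY KEY is an alias for the rowid, so an update by that key goes straight to the row, and the UNIQUE constraint on name gives the initial lookup an index as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: id is the rowid alias, name gets an automatic unique index.
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT UNIQUE, score INTEGER)"
)
conn.execute("INSERT INTO users (name, score) VALUES (?, ?)", ("alice", 0))

# Look the row up by name once, at the start of the session.
user_id = conn.execute(
    "SELECT id FROM users WHERE name = ?", ("alice",)
).fetchone()[0]


def set_score(conn, user_id, score):
    """Update the score of a row whose primary key is already known."""
    conn.execute("UPDATE users SET score = ? WHERE id = ?", (score, user_id))


# Frequent updates during the session reuse the cached id.
for new_score in (10, 25, 40):
    set_score(conn, user_id, new_score)
conn.commit()

final_score = conn.execute(
    "SELECT score FROM users WHERE id = ?", (user_id,)
).fetchone()[0]
print(final_score)
```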

Tables design for a simple messaging system

I have a simple message system where every message has one sender and always exactly one receiver, who is never the sender. So my design is as follows:
create table user
(
PersonID int,
Name varchar(255)
);
create table message
(
MessageID int,
FromPersonID int,
ToPersonID int,
Message varchar(160)
);
To get all messages of a given PersonID I write:
SELECT MessageID FROM message WHERE PersonID=FromPersonID OR PersonID=ToPersonID
Now I have two questions:
Is this the proper (and fastest) way to design that relation?
How is this relation described in a database diagram?
Yup, that's pretty much the textbook way to do it.
Not sure what you mean by "how is it described in a diagram". In a diagram you would draw two boxes, one for each table. Then there would be two lines connecting User and Message, one labeled "from" and the other labeled "to". The exact shape of the boxes and appearance of the lines depends on what diagramming convention you are using.
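For illustration, the single-table design with its two "from" and "to" relationships can be sketched like this. It is a hedged example, not the asker's actual schema: it uses SQLite through Python's sqlite3 module, and the foreign keys, the CHECK constraint for "receiver is never the sender", the two index names and the helper messages_for are all additions assumed for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE user (
    PersonID INTEGER PRIMARY KEY,
    Name     TEXT NOT NULL
);
CREATE TABLE message (
    MessageID    INTEGER PRIMARY KEY,
    FromPersonID INTEGER NOT NULL REFERENCES user (PersonID),  -- the "from" line
    ToPersonID   INTEGER NOT NULL REFERENCES user (PersonID),  -- the "to" line
    Message      TEXT NOT NULL,
    CHECK (FromPersonID <> ToPersonID)  -- the receiver is never the sender
);
-- Assumed indexes so either side of the OR can be looked up without a full scan.
CREATE INDEX idx_message_from ON message (FromPersonID);
CREATE INDEX idx_message_to   ON message (ToPersonID);
""")

conn.executemany(
    "INSERT INTO user (PersonID, Name) VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Cy")],
)
conn.executemany(
    "INSERT INTO message (FromPersonID, ToPersonID, Message) VALUES (?, ?, ?)",
    [(1, 2, "hi"), (2, 1, "hello"), (2, 3, "yo")],
)


def messages_for(conn, person_id):
    """Return the ids of all messages sent or received by person_id."""
    rows = conn.execute(
        "SELECT MessageID FROM message "
        "WHERE FromPersonID = ? OR ToPersonID = ? ORDER BY MessageID",
        (person_id, person_id),
    )
    return [row[0] for row in rows]


ann_messages = messages_for(conn, 1)
print(ann_messages)
```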
You can normalize it according to your query.
For the query
SELECT MessageID FROM message WHERE PersonID=FromPersonID OR PersonID=ToPersonID
you can create a normalized structure:
create table user
(
PersonID int,
Name varchar(255)
);
create table message_meta
(
MessageID int, -- needed so the query below can return it and join to message_data
FromPersonID int,
ToPersonID int
);
create table message_data
(
MessageID int,
Message varchar(160)
);
and run a query like
SELECT MessageID FROM message_meta WHERE PersonID=FromPersonID OR PersonID=ToPersonID
This will be more efficient.

What should I make the type of a "marital status" field?

I have a field in my table, "marital status". The user has to choose (via radio button) whether they are married, divorced, single or widowed.
What should I make the type of this field?
Is there a boolean type?
Marital status doesn't sound like a boolean anyway. It sounds like an enumeration. A boolean would be married (Y/N), although I think in this day and age you might want to be able to store multiple kinds of relationships in there, and you specified yourself that you need to store 'divorced' as well, so a boolean is out of the question.
So I'd recommend making a table named MaritalStatus, having an ID and a description. Store the various states in there, and make a foreign key to MaritalStatusID in your table.
Make it an INT field and create another table in your database, something like:
CREATE TABLE dbo.MaritalStatus
(
M_ID INT PRIMARY KEY NOT NULL,
M_Status NVARCHAR(20)
)
GO
INSERT INTO dbo.MaritalStatus
VALUES
(1, 'Single'),(2,'Married'),(3,'Divorced'),
(4,'Widowed'),(5,'Other'),(6,'Prefer not to say') -- and so on
Now, in the "Marital Status" field of your table, refer to a user's marital status using the INT values from the "M_ID" column of dbo.MaritalStatus.
A boolean (the bit datatype in SQL) is best when something can only be TRUE or NOT TRUE. Someone's marital status can have more than two possible values, so you should create a separate table for all the possible marital statuses and use a foreign key constraint.
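A runnable sketch of the lookup-table approach is shown below. It is an illustration under assumptions: it uses SQLite through Python's sqlite3 module instead of SQL Server, and the Person table and its columns are hypothetical. It shows the foreign key accepting a known status and rejecting an id that is not in the lookup table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE MaritalStatus (
    M_ID     INTEGER PRIMARY KEY,
    M_Status TEXT NOT NULL UNIQUE
);
-- Hypothetical table holding the marital status field.
CREATE TABLE Person (
    PersonID        INTEGER PRIMARY KEY,
    Name            TEXT NOT NULL,
    MaritalStatusID INTEGER NOT NULL REFERENCES MaritalStatus (M_ID)
);
INSERT INTO MaritalStatus (M_ID, M_Status) VALUES
    (1, 'Single'), (2, 'Married'), (3, 'Divorced'), (4, 'Widowed');
""")

# A valid status id is accepted and can be resolved to its description.
conn.execute(
    "INSERT INTO Person (Name, MaritalStatusID) VALUES (?, ?)", ("Ann", 2)
)
status = conn.execute(
    "SELECT s.M_Status FROM Person AS p "
    "JOIN MaritalStatus AS s ON s.M_ID = p.MaritalStatusID "
    "WHERE p.Name = ?",
    ("Ann",),
).fetchone()[0]

# An id that is not in the lookup table is rejected by the foreign key.
try:
    conn.execute(
        "INSERT INTO Person (Name, MaritalStatusID) VALUES (?, ?)", ("Bob", 10)
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(status, rejected)
```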
The boolean equivalent for T-SQL is bit.
Though, it seems like you want more than a yes/no answer. In this case use an int and then convert the int to an enum.
Edit: Dukeling removed the C# tag in an edit, so I am not sure how relevant this part is anymore /Edit
The enum:
enum MaritalStatus
{
Single,
Married,
Divorced,
...
}
The int from DB:
int maritalStatusFromDB = //value from DB
Convert int to enum:
MaritalStatus maritalStatus = (MaritalStatus)maritalStatusFromDB;
Be aware that your database may contain int values that are not defined in your enum, such as 10. You can check whether maritalStatusFromDB is a valid MaritalStatus as follows:
bool isValid = Enum.IsDefined(typeof(MaritalStatus), maritalStatusFromDB);
if( isValid == false )
{
//handle appropriately
}

How to optimize LINQ-to-SQL for recursive queries?

I have the following SQL table:
ObjectTable
--------------------------------------------------
| ID | Name | Order | ParentID |
| int PK | nvarchar(50) | int | int FK |
ObjectTable.ParentID is a nullable field with a relationship to a different Object record's ID. LINQ-to-SQL generates a class that looks like:
public class DbObject{
int ID { get; set; }
string Name { get; set; }
int Order { get; set; }
int? ParentID { get; set; }
DbObject Parent { get; set; }
EntitySet<DbObject> ChildObjects { get; set; }
}
When I load a DbObject instance, I need to be able to recursively access the child objects, so that I can write the data to a hierarchical JSON object.
Will I execute a separate query every time I access the elements via DbObject.ChildObjects? Since DB transactions take the longest, this seems like a very inefficient way to load a recursive hierarchy.
What is the best practice for executing a recursive query with LINQ-to-SQL and/or EF?
Is LINQ-to-SQL integrated with Common Table Expressions?
I found this related question: "Common Table Expression (CTE) in linq-to-sql?"
In our case we have created stored procedures with CTEs and put them on the L2S designer. They work like any other table, except that they behave as a method rather than a property, and we have experienced no problems with that so far.
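To show what such a CTE does, here is a minimal sketch that loads a whole subtree in a single query and then assembles the hierarchy in memory. It is not LINQ-to-SQL: it uses SQLite through Python's sqlite3 module, the sample rows are invented, and the names SUBTREE_SQL and load_tree are hypothetical. The same recursive query shape is what a stored procedure on SQL Server would contain, written in T-SQL.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ObjectTable (
    ID       INTEGER PRIMARY KEY,
    Name     TEXT NOT NULL,
    "Order"  INTEGER NOT NULL,
    ParentID INTEGER REFERENCES ObjectTable (ID)
);
-- Invented sample data: one tree under ID 1 and an unrelated root.
INSERT INTO ObjectTable (ID, Name, "Order", ParentID) VALUES
    (1, 'root', 0, NULL),
    (2, 'b', 2, 1),
    (3, 'a', 1, 1),
    (4, 'a1', 1, 3),
    (5, 'other', 0, NULL);
""")

# Recursive CTE: start at the requested row, then keep joining children.
SUBTREE_SQL = """
WITH RECURSIVE subtree AS (
    SELECT ID, Name, "Order" AS SortOrder, ParentID
    FROM ObjectTable
    WHERE ID = ?
    UNION ALL
    SELECT o.ID, o.Name, o."Order", o.ParentID
    FROM ObjectTable AS o
    JOIN subtree AS s ON o.ParentID = s.ID
)
SELECT ID, Name, SortOrder, ParentID FROM subtree
"""


def load_tree(conn, root_id):
    """Fetch the subtree in one round trip and nest children under parents."""
    rows = conn.execute(SUBTREE_SQL, (root_id,)).fetchall()
    nodes = {
        row[0]: {"id": row[0], "name": row[1], "order": row[2], "children": []}
        for row in rows
    }
    for node_id, _name, _order, parent_id in rows:
        if node_id != root_id and parent_id in nodes:
            nodes[parent_id]["children"].append(nodes[node_id])
    for node in nodes.values():
        node["children"].sort(key=lambda child: child["order"])
    return nodes.get(root_id)


tree = load_tree(conn, 1)
print(json.dumps(tree, indent=2))
```

The design choice here is to let the database walk the hierarchy and to do only the cheap nesting step in application code, which avoids one query per node.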