Simplest way to persist data in Azure - recommended options? - asp.net-mvc-4

I'm giving Azure a go with MVC4, and have the simplest data storage requirement I can think of. Each time my controller is hit, I want to record the datetime and a couple of other details to some location. There will most likely be only a few thousand hits at most per month. Then I'd like to view a page telling me how many hits (rows appended) there are.
Writing to a text file in the folder with Server.MapPath... gives permission errors and doesn't seem viable anyway given the distributed nature of the platform. Getting a whole SQL instance is $10 a month or so. Using table or blob storage sounds promising, but setting up the service and learning to use it seems nowhere near as simple as a basic file or DB.
Any thoughts would be appreciated.

Use Table Storage. For all intents and purposes it's free at that volume (it'll be pennies per month, a fraction of what your web roles cost anyway).
As for how complicated you think it is, it's really not. Have a look at this article to get going: http://www.windowsazure.com/en-us/develop/net/how-to-guides/table-services/#create-table
// Create a class to hold your data
public class MyLogEntity : TableEntity
{
    public MyLogEntity(int id, DateTime when)
    {
        // PartitionKey and RowKey must be strings
        this.PartitionKey = when.ToString("yyyy-MM-dd");
        this.RowKey = id.ToString();
    }

    public MyLogEntity() { }

    public string OtherProperty { get; set; }
}
// Connect to Table Storage
var connstr = CloudConfigurationManager.GetSetting("StorageConnectionString"); // from your config file
var storageAccount = CloudStorageAccount.Parse(connstr);

// Create the table client.
CloudTableClient tableClient = storageAccount.CreateCloudTableClient();

// Create the table if it doesn't exist.
var table = tableClient.GetTableReference("MyLog");
table.CreateIfNotExists();

var e = new MyLogEntity(%SOMEID%, %SOMEDATETIME%);
e.OtherProperty = "Some Other Value";

// Create the TableOperation that inserts the log entity.
var insertOperation = TableOperation.Insert(e);

// Execute the insert operation.
table.Execute(insertOperation);

Augmenting @Eoin's answer a bit: when using table storage, tables are segmented into partitions based on the partition key you specify. Within a partition, you can either look up a specific row (via its row key) or scan the partition for a group of rows. Exact-match lookups are very, very fast; partition scans (or table scans) can take a while, especially with large quantities of data.
In your case, you want a count of rows (entities). Storing your rows seems pretty straightforward, but how will you tally up a count? By day? By month? By year? It may be worth aligning your partitions to a day or a month to make counting quicker (there's no function that returns the number of rows in a table or partition - you'd end up querying for them and counting).
One trick is to keep an accumulated count in another table, updated each time you write a specific entity. This would be very fast:
Write the entity (similar to what Eoin illustrated)
Read the row from a Counts table corresponding to the type of row you just wrote
Increment its value and write it back
Now you have a very fast way to retrieve counts at any given time. You could have counts for individual days, specific months, whatever you choose. And for this, you could have the specific date as your partition key, giving you very fast access to the correct entity holding the accumulated count.
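To make that concrete, here is a minimal sketch of the read-increment-write step against table storage, building on the code above; the HitCountEntity class, the Counts table, and the per-day keys are illustrative choices, not part of the original answer:
// Illustrative counter entity (names are assumptions, not from the answer above)
public class HitCountEntity : TableEntity
{
    public HitCountEntity(DateTime day)
    {
        PartitionKey = day.ToString("yyyy-MM"); // one partition per month
        RowKey = day.ToString("yyyy-MM-dd");    // one row per day
    }

    public HitCountEntity() { }

    public int Count { get; set; }
}

// Read the current count, increment it, and write it back
var countsTable = tableClient.GetTableReference("Counts");
countsTable.CreateIfNotExists();

var now = DateTime.UtcNow;
var retrieve = TableOperation.Retrieve<HitCountEntity>(now.ToString("yyyy-MM"), now.ToString("yyyy-MM-dd"));
var counter = countsTable.Execute(retrieve).Result as HitCountEntity ?? new HitCountEntity(now);
counter.Count++;

// Note: this read-modify-write is not atomic; with several role instances you'd want
// optimistic concurrency (ETag + Replace with a retry) rather than InsertOrReplace.
countsTable.Execute(TableOperation.InsertOrReplace(counter));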

Related

Checking Whether Table Data Exists, Updating / Inserting Into Two Tables & Posting End Outcome

I am working on my cron system, which gathers information via an API call. For the most part it has been fairly straightforward, but now I am faced with multiple difficulties, as the API response depends on who is making the request. It runs through each user's API key, and certain information will be visible or hidden to them, and vice versa for the public.
There are teams, and users are part of teams. A user can stealth their move; all of the information is still shown to them and their team, but it is not visible to their opponent. Both teams share the same ID and have access to the same information, it's just that one side can see more of it than the other.
Defendant's Point Of View
"attacks": {
    "12345": {
        "timestamp": 1645345234,
        "attacker_id": "",
        "attacker_team_id": "",
        "defender_id": 321,
        "defender_team_id": 1,
        "stealthed": 1
    }
}
Attacker's Point Of View
"attacks": {
    "12345": {
        "timestamp": 1645345234,
        "attacker_id": 123,
        "attacker_team_id": 2,
        "defender_id": 321,
        "defender_team_id": 1,
        "stealthed": 1,
        "boosters": {
            "fair_fight": 3,
            "retaliation": 1,
            "group_attack": 1
        }
    }
}
So, if the defendant's API key is used first, id 12345 will already be in the team_attacks table but will not include the attacker_id and attacker_team_id. For each insert thereafter, I need to check whether the newly inserted ID already exists and whether there is any additional information to add to the row.
Here is the part of my code that loops through the API response and obtains the data; it loops through all of the attacks per API key:
else if ($category === "attacks") {
    $database = new Database();
    foreach ($data as $attack_id => $info) {
        $database->query('INSERT INTO team_attacks (attack_id, attacker_id, attacker_team_id, defender_id, defender_team_id) VALUES (:attack_id, :attacker_id, :attacker_team_id, :defender_id, :defender_team_id)');
        $database->bind(':attack_id', $attack_id);
        $database->bind(':attacker_id', $info["attacker_id"]);
        $database->bind(':attacker_team_id', $info["attacker_team_id"]);
        $database->bind(':defender_id', $info["defender_id"]);
        $database->bind(':defender_team_id', $info["defender_team_id"]);
        $database->execute();
    }
}
I have also been writing to a news table; typically I have simply been posting "X new entries have been added" or the like. However, I haven't a clue whether there is a way, during the loop above, to tell new entries apart from updated entries so that I can produce two news items:
2 attacks have been updated.
49 new attacks have been added.
For this part, I was simply counting how many entries are in the array, but that only works for the very first upload; I know I cannot simply count the array length on future inserts, which require the additional checks.
If the attack_id does NOT already exist, I also need to submit the boosters into another table. For this I was adding them to an array during the loop above and then looping through that afterwards to insert them, but this again depends on the same check, rather than simply attempting an upload for each one without any checks. Boosters share the attack_id.
With over 1,000 teams that will potentially have at least one member join my site, I need to be as efficient as possible. The API gives the last 100 attacks per call, and I want this to run in my cron, which collects any new data every 30 seconds, so I may need to sort through potentially 100,000 rows.
In SQL, you can check conditions when inserting new data using merge:
https://en.wikipedia.org/wiki/Merge_(SQL)
Depending on the database you are using, the name and syntax of the command might be different. Common names for the command are also upsert and replace.
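As a sketch, on MySQL the insert from the question could be written as an upsert like this, assuming attack_id is the table's primary or unique key (adjust the syntax for your actual engine):
-- Insert the attack, or fill in the attacker columns if the attack_id row
-- already exists (e.g. it was written earlier from the defender's API key).
INSERT INTO team_attacks
    (attack_id, attacker_id, attacker_team_id, defender_id, defender_team_id)
VALUES
    (:attack_id, :attacker_id, :attacker_team_id, :defender_id, :defender_team_id)
ON DUPLICATE KEY UPDATE
    attacker_id      = COALESCE(NULLIF(VALUES(attacker_id), ''), attacker_id),
    attacker_team_id = COALESCE(NULLIF(VALUES(attacker_team_id), ''), attacker_team_id);
MySQL also reports 1 affected row for a brand-new insert and 2 for a row changed this way, which (depending on your PDO settings) gives you a cheap way to build the "X new / Y updated" counts.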
But: if you are after high performance and near-real-time behaviour, consider keeping the critical aggregated data in a cache instead of doing the aggregation 100,000 times per minute.
This may or may not be the "answer" you're looking for. The question implies the use of a single table for both teams. It's worth considering one table per team for writes, to avoid write contention altogether. The two data sets could then be combined at query time to return "team" results via the API. At scale, you could have another process calculating and storing combined team results in an API-specific cache table that serves the API requests.

SSIS Inserting incrementing ID with starting range into multiple tables at a time

Are there one or more reliable ways to solve an easy task?
I've got a number of XML files which will be converted into 6 SQL tables (via SSIS).
Before the end of this process I need to add a new column (in fact, one common to all tables) to each of them.
This column represents an ID with an assigned starting value and a +1 increment step, like (350000, 1).
Yes, I know how to solve it at the SSMS/SQL stage, but I need a solution at SSIS's pre-SQL conversion level.
I'm sure there are well-known patterns to deal with this.
I am going to take a stab at this. Just to be clear, I don't have a lot of information in your question to go on.
Most XML files that I have dealt with have a common element (let's call it a customer) with one-to-many attributes (these can be invoices, addresses, emails, contacts, etc.).
So your table structure will be somewhat star shaped around the customer.
So your XML will have core customer information on a 1-to-1 basis that can be loaded into a single main table, and it will have array information such as invoices and addresses. Those arrays would become their own tables referencing the customer as a key.
I think you are asking how to create that key.
Load the customer data first and return the identity column to be used as a foreign key when loading the other tables.
I find it easiest to do this in a script component. I'm only going to explain how to get the key back; I personally would handle the whole process in C# (deserializing and all).
Add this to your using block:
using System.Data.OleDb;
Add this into your main or row processing depending on where the script task / component is:
string SQL = @"INSERT INTO Customer(CustName, field1, field2, ...)
               VALUES(?, ?, ?, ...); SELECT CAST(SCOPE_IDENTITY() AS int);";
OleDbCommand cmd = new OleDbCommand();
cmd.CommandType = System.Data.CommandType.Text;
cmd.CommandText = SQL;
cmd.Parameters.AddWithValue("@p1", [CustName]); // OleDb parameters are positional; the name is just a placeholder
...
cmd.Connection = conn; // an OleDbConnection to the destination database, e.g. from your connection manager
cmd.Connection.Open();
int CustomerKey = (int)cmd.ExecuteScalar(); // ExecuteScalar returns the first column of the first row, which here is the SCOPE_IDENTITY value
cmd.Connection.Close();
Now you can use CustomerKey for all of the other tables.
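To illustrate that follow-on step, here is a minimal sketch; the Invoice table, its columns, and the Row.* input columns are made up for the example and are not part of the question:
// Sketch: reuse CustomerKey as the foreign key when loading a child table
string childSQL = @"INSERT INTO Invoice(CustomerKey, InvoiceNumber, Amount) VALUES(?, ?, ?);";
using (OleDbCommand childCmd = new OleDbCommand(childSQL, cmd.Connection))
{
    childCmd.Parameters.AddWithValue("@p1", CustomerKey);
    childCmd.Parameters.AddWithValue("@p2", Row.InvoiceNumber);
    childCmd.Parameters.AddWithValue("@p3", Row.Amount);
    cmd.Connection.Open();
    childCmd.ExecuteNonQuery();
    cmd.Connection.Close();
}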

Lua - Managing tables... Object Oriented?

For a while now, I have been managing Lua tables via the use of functions and whatnot for character progress tracking. Lately, I've been reading more and more about OO methods and metatables, and I'm wondering if that is a better way to handle this. For instance, here's a breakdown of progress tracking:
When a character is first initialized, I do:
init_tracker(pname, reset)
which pulls a text file template of the starter database. This template breaks down the table as such:
Player Name {
    Exp {
        Several keys here, values as 0
    },
    Quests {
        Several keys here, values as 0
    },
    Campaigns {
        Several keys here, values as 0
    },
}
etc. There are other keys under the player name, but that's the gist of what the table structure looks like. I keep track of yearly, monthly, weekly, daily, hourly, and per-level stats that reset appropriately, and I also track the previous stats (last week, last year, last month, etc.). For that, I have functions that iterate through each table, copy it to the "last" table, then reset the values in the current table.
Would it be better to use metatables and methods for this? For instance, should I be doing:
Player["Current"]:update()
with an update function as a method, rather than
for i, v in pairs(Player["Current"]) do
Player["Last"][i] = v
Player["Current"][i] = 0
end
(which, by the way, does not seem to work anymore, as it's always displaying 0)?
If I would be better off using OO, how would I structure that into what I have now?

Unsolvable events scenario in RavenDB?

Edit. Here's a simplified description of the issue:
I have events
class Event { Id = 0, Dates = new DateTime[] {} }
I need to query for all events within a date range, for example August 1 to October 20. The result shall list unique events within this range, ordered by date, like this:
Event one 2012-08-04,2012-09-06,2012-09-10
Event two 2012-10-02
etc.
I need to be able to page this result. That's it.
I have the following issue with my events using RavenDB. I have a document (representing an event) that contains an array of dates, for example 2012-08-20, 2012-08-21, 2012-09-14, 2013-01-05, etc.
class Event { Dates = []; }
I have a few criteria that must be met:
I need to be able to query these documents on a date range. For example, find all events that have any date between August 1 and September 22, or between October 1 and October 3.
I must be able to sort the query on date
I must be able to page the result.
Seems easy enough, right? Well, I have tried two approaches, but they both fail.
Create an index with multiple from clauses, like so:
from event in docs.Events
from date in event.Dates
select new { Dates = date}
This works and is easy to query. However, it can't be paged correctly because of skipped results (the index will contain duplicates of each event), and sorting also fails in combination with paging.
...............
Create a simple index
from event in docs.Events
select new { Dates = event.Dates }
This also works and is simple to query, and it can be paged. However, it cannot be sorted; I need to sort the documents by the first available date within the queried range.
If I can't solve this it will probably be a deal breaker for us, and I really don't want to get started on a new application; besides, I really like RavenDB.
I had a similar requirement (recurring events), with the added twist that the number of dates was highly variable (could be from 1 to the hundreds) and may be associated with a venue.
I ended up storing event dates in an EventOccurrence after coming to a similar impasse:
public class EventOccurrence {
    public string EventId { get; set; }
    public DateTime Start { get; set; }
    public string VenueId { get; set; }
}
It's easily queryable and sortable, and by using Session.Include on the occurrence we still retain query performance.
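As a rough sketch of how that gets queried (the from, to, page, and pageSize variables are assumptions, not from the answer):
// Query occurrences in a date range; Include pulls the referenced Event documents
// down in the same round trip, so the Load calls below hit the session, not the server.
var occurrences = session.Query<EventOccurrence>()
    .Include(o => o.EventId)
    .Where(o => o.Start >= from && o.Start <= to)
    .OrderBy(o => o.Start)
    .Skip(page * pageSize)
    .Take(pageSize)
    .ToList();

foreach (var occurrence in occurrences)
{
    var ev = session.Load<Event>(occurrence.EventId); // served from the session, no extra request
}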
It seems like a reversion to the relational model, but it was the correct choice given our constraints.
You're saying that the "simple index" approach works except for sorting right?
from event in docs.Events
select new
{
Dates = event.Dates,
FirstDate = event.Dates.First()
}
Then you can sort on FirstDate. You can't sort on analyzed or tokenized fields.
My final solution for this was creating both indexes. We needed both indexes anyway in our application so it wasn't any overhead.
The following index is used for querying and paging. Take() and Skip() work for this one:
from event in docs.Events
from date in event.Dates
select new { Date = date}
However the above index does NOT return the correct number of total hits, which you need for creating a pager. So we create another index:
from event in docs.Events
select new { Date = event.Dates }
Now we can run the exact same query (note that the Date field has the same name on both indexes) on the above index using Statistics() and Take(0) to only get the number of hits.
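In query form that looks roughly like this (the index name, result type, and range variables are illustrative; the important parts are Statistics() and Take(0)):
// A small projection type matching the Date field that both indexes share
public class EventDateResult
{
    public DateTime Date { get; set; }
}

// Run the same date-range filter against the second index, purely to read the statistics
RavenQueryStatistics stats;
session.Query<EventDateResult>("Events/ByDateForCount")
    .Statistics(out stats)
    .Where(r => r.Date >= from && r.Date <= to)
    .Take(0)
    .ToList();

int totalHits = stats.TotalResults; // the correct total for the pager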
The downside to this is obviously that you need to run two queries, but I haven't found a way around that.

How do I efficiently create a TRIGGER which, on occasion, generates a VIEW?

On a small pre-registration database, I'm using the following SQL to generate a VIEW whenever a specific user name is given. I'm using this mainly to get a snapshot whenever a sysadmin suspects duplicate names are registering. This will be done rarely (at most once per hour), so the database schema shouldn't get excessively big.
CREATE OR REPLACE TRIGGER upd_on_su_entry
AFTER UPDATE OR INSERT
ON PRE_REG_MEMBER
FOR EACH ROW
BEGIN
    IF :new.MEMBER_NAME = 'SysAdmin Dup Tester' THEN
        EXECUTE IMMEDIATE 'CREATE OR REPLACE VIEW mem_n AS SELECT :old.MEMBER_NAME, COUNT(:old.MEMBER_NAME) FROM MEMBER GROUP BY MEMBER_NAME';
    END IF;
END;
However, this appears to be a bloated, inefficient, and erroneous way of working (according to my admin). Is there a fundamental error here? Can I take an equivalent snapshot some other way?
I'm very new to SQL, so please bear with me.
Also, I want to use the view like this:
public void dups() throws Exception
{
    Calendar cal = Calendar.getInstance();
    jt.setText("Duplicate List at : " + cal.getTime());
    try {
        rs = stat.executeQuery("select * from upd_on_su_entry");
        while (rs.next())
        {
            jt.append(rs.getString("MEMBER_NAME") + "\t");
            jt.append(rs.getString(2) + "\t");
        }
    }
    catch (Exception e) { System.out.print("\n" + e); }
}
There seem to be some misunderstandings here.
1.) Views are basically stored SQL statements, not stored SQL results, so your view will always display the data as it is at the time you query the view.
2.) Never, ever use DDL (CREATE statements and the like) during normal processing of an application. It's just not the way databases are intended to work.
If you want a snapshot at a point in time, create a secondary table which contains all the columns of the original table plus a snapshot time stamp.
Whenever you want to make a snapshot, copy all the data you want from the original table into the snapshot table, adding the current timestamp.
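A sketch of that in Oracle syntax (the snapshot table name and the timestamp column are made up for the example):
-- One-off: create a snapshot table with the original columns plus a timestamp
CREATE TABLE pre_reg_member_snapshot AS
SELECT m.*, CAST(NULL AS TIMESTAMP) AS snapshot_ts
FROM pre_reg_member m
WHERE 1 = 0;

-- Each time a snapshot is wanted: copy the rows and stamp them
INSERT INTO pre_reg_member_snapshot
SELECT m.*, SYSTIMESTAMP
FROM pre_reg_member m;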
Based on your comment, it sounds like you want something like this:
SELECT MEMBER_NAME FROM PRE_REG_MEMBER
GROUP BY MEMBER_NAME HAVING COUNT(*) > 1;
This will return all members with more than one row in the table
Again, ask yourself "what am I trying to do"?
Don't focus on the technology. Do that after you have a clear idea of what your goal is.
If you are just trying to avoid duplicate registrations on your database, just search the users table and show an error if that username is already there.
Also, think of your datamodel carefully before going into the implementation details.