Zero knowledge? proof item association - cryptography

I'm a total noob when it comes to cryptography but I believe this falls under the "zero knowledge" category.
I have two associated pieces of information:
tag - Known by both parties. Unique per scenario.
identity - Known by only one party. Potentially associated to multiple tags. Comes from a pool known by both parties.
I need a way to prevent the party with the association from changing the value of the identity. There are around one hundred concurrent associations per scenario. The pool of potential identities can be relatively small, even smaller than the number of tags.
The most primitive option would be to hash the tag and identity together but with such a small pool of potential identities I fear it would be trivial to brute force the hash...
During the scenario more and more of these associations will become public. At least at that point I should be able to confirm that the other party did not modify the association. I don't really have to confirm this before then because unrevealed associations are not relevant. I just need to prevent the knowing party from picking and choosing an identity at the moment of revealing.
Is such a thing even possible? How could it be done? How difficult would it be to implement?

You shouldn't implement this yourself.
You have two parties, Alice and Bob. Alice has the (identity, tag) pair. Bob only has the tag.
Alice wants to prove to Bob that the pair contains an identity she did not change, but she does not want to reveal that identity to Bob.
What you want is a "signature scheme with efficient protocols". I know of no APIs that expose this functionality directly. However, these schemes are widely used in anonymous credential systems, which can be used for your purposes.
Thankfully, there are two systems that support this type of thing. One is IBM Idemix which uses the above technique and is where you should look first. The other is Microsoft's U-Prove.
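For the narrower requirement in the question (confirming at reveal time that an association was not changed), a salted hash commitment may be enough. This is a simpler technique than the signature scheme with efficient protocols named above, and it proves nothing before the reveal. A minimal Python sketch, assuming SHA-256 and a fresh 32-byte random nonce per association; the function names commit and verify are made up for illustration:

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def _digest(nonce: bytes, tag: str, identity: str) -> str:
    # Length-prefix each field so different (tag, identity) splits
    # cannot produce the same hashed byte string.
    h = hashlib.sha256()
    for part in (nonce, tag.encode("utf-8"), identity.encode("utf-8")):
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.hexdigest()


def commit(tag: str, identity: str) -> Tuple[str, bytes]:
    """Alice: return (commitment to publish now, nonce to keep secret)."""
    # The random nonce is what stops Bob from brute-forcing the small
    # identity pool: he would have to guess 256 bits as well.
    nonce = secrets.token_bytes(32)
    return _digest(nonce, tag, identity), nonce


def verify(commitment: str, tag: str, identity: str, nonce: bytes) -> bool:
    """Bob: check a revealed (identity, nonce) against the old commitment."""
    return hmac.compare_digest(commitment, _digest(nonce, tag, identity))
```

Alice would publish the commitment for every tag at the start of the scenario and hand over the identity and nonce when an association becomes public. Whether this is sufficient depends on the threat model, so treat it as a starting point rather than a vetted protocol.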

Related

Is it secure to display/embed user_id that represents each user from the db in the html of a page?

If I display the user_id that represents each unique user in the db as an attribute in an HTML element, is that good practice? I need the reference to the user if I want to perform an action on that particular user, such as adding him as a friend.
Example in HTML:
<div data-user-id='12' onclick=addFriend(12)>
Click to add John as your friend
</div>
Where 12 is John's actual user id in the db. From a security perspective, is it secure to do this?
It's never a problem to display the user id. Actually, it's more secure than showing the username, which could be used for logins. A better solution, though, is to display an id that can be set or changed by the user himself; look at Facebook's design for reference.
In this case, you want the user to be able to set his public id, and you use this public id to identify the user externally, then map it to the actual user id internally in the back end.
Anyway, none of this can be settled in the abstract. To decide how secure it is, you need to consider the other security design elements of your application. The main question is always: what can a malicious user do by knowing the actual user id?
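The public-id mapping described above could look roughly like the following. This is a minimal in-memory sketch in Python; the class PublicIdDirectory and its methods are hypothetical names, and a real back end would keep the mapping in the database with a unique constraint on the public id.

```python
from typing import Dict, Optional


class PublicIdDirectory:
    """Hypothetical map from user-chosen public ids to internal user ids."""

    def __init__(self) -> None:
        self._internal_by_public: Dict[str, int] = {}
        self._public_by_internal: Dict[int, str] = {}

    def set_public_id(self, internal_id: int, public_id: str) -> None:
        owner = self._internal_by_public.get(public_id)
        if owner is not None and owner != internal_id:
            raise ValueError("public id already taken")
        # Drop the user's previous public id, if any, so it can be reused.
        old = self._public_by_internal.pop(internal_id, None)
        if old is not None:
            del self._internal_by_public[old]
        self._internal_by_public[public_id] = internal_id
        self._public_by_internal[internal_id] = public_id

    def resolve(self, public_id: str) -> Optional[int]:
        """Translate an id seen in the page back to the database id."""
        return self._internal_by_public.get(public_id)
```

The page would then carry only the public id, and the server would call resolve before touching the database.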
As usual... it's complicated.
It all depends on how attractive your site is to hackers (and therefore how much effort they're going to invest), and how secure the rest of your solution is.
The first step in an organized attack is to find out as much as possible about your website. Your current solution leaks information. Knowing that users are identified by an integer may be useful (some database engines are more likely to use integers rather than GUIDs). It may help attackers guess other keys. By guessing sequential user IDs, they can find out how many users you have.
Once the attacker has found out all this information, they will use it to try and penetrate your application. The more information they have, the easier it is to create a plan. An individual piece of information may not be useful, but when you put it together with other snippets, it may reveal something useful.
So, no, there's no obvious major risk in this design by itself. It may be part of a wider attack, though, and it could be the bit of information that exposes some other flaw.

Should an API assign and return a reference number for newly created resources?

I am building a RESTful API where users may create resources on my server using post requests, and later reference them via get requests, etc. One thing I've had trouble deciding on is what IDs the clients should have. I know that there are many ways to do what I'm trying to accomplish, but I'd like to go with a design which follows industry conventions and best design practices.
Should my API decide on the ID for each newly created resource (it would most likely be the primary key for the resource assigned by the database)? Or should I allow users to assign their own reference numbers to their resources?
If I do assign a reference number to each new resource, how should this be returned to the client? The API has some endpoints which allow for bulk item creation, so I would need to list out all of the newly created resources on every response?
I'm conflicted because allowing the user to specify their own IDs is obviously a can of worms - I'd need to verify each ID hasn't been taken, and it makes database queries a lot weirder as I'd be joining on reference# and userID rather than a foreign key. On the other hand, if I assign IDs to each resource, clients have to build some type of response parser and follow my imposed conventions.
Why not do both? Let the user create their own reference and create your own uid as well. If the users have to log in, then you can use their reference plus the user ID as a unique key. I would also return the uid you created; if the client doesn't need it, it can ignore it.
It wasn't practical (for me) to develop both of the above methods into my application, so I took a leap of faith and allowed the user to choose their own IDs. I quickly found that this complicated development so much that it would have added weeks to my development time, and resulted in much more complex and slow DB queries. So, early on in the project I went back and made it so that I just assign IDs for all created resources.
Life is simple now.
Other popular APIs that I looked at, such as the Instagram API, also assign IDs to certain created resources, which is especially important if you have millions of users who can interact with each-other's resources.
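To illustrate the server-assigned approach for the bulk endpoints mentioned in the question, here is a minimal Python sketch. It assumes an in-memory store in place of the database and returns the created resources in request order, each with its new id; ResourceStore and create_many are hypothetical names, not part of any framework.

```python
import itertools
from typing import Dict, List


class ResourceStore:
    """Hypothetical stand-in for a table with an auto-increment primary key."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)
        self._rows: Dict[int, dict] = {}

    def create_many(self, payloads: List[dict]) -> List[dict]:
        """Create one resource per payload; return them in the same order."""
        created = []
        for payload in payloads:
            new_id = next(self._next_id)
            self._rows[new_id] = dict(payload)
            # The server-assigned id wins over any "id" the client sent.
            created.append({**payload, "id": new_id})
        return created

    def get(self, resource_id: int) -> dict:
        return {**self._rows[resource_id], "id": resource_id}
```

Because the response list matches the request order, a client can pair each submitted item with its id by position without any extra parsing convention.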

Can a client ever reliably generate a PK for an object being written to a DB?

I've been fussing with this dilemma for a while now and I thought I'd approach SO.
A bit of background on my scenario:
I have Playlists which contain 0 or more PlaylistItem children.
Playlists are created infrequently. As such, I am OK with requesting a GUID from the server, waiting for a response and, on success, refreshing the UI to show the successfully added Playlist.
PlaylistItem objects are created frequently. As such, I am NOT OK with a loading message while I wait for the server to respond with a UUID.
This last fact is an optimization, I know, but I think it greatly improves the usability of the program.
Nevertheless, I would like to discuss my options for uniquely identifying my object client-side. I'll first highlight the two options I have tried and failed with, followed by a third option I am considering. I would love some insight into other possible solutions.
Generating a PK UUID client-side which will be persisted to the server.
This was my first choice. It was an obvious decision, but has some clear shortcomings. The first issue here is that client-side UUIDs can't and shouldn't be trusted for this sort of purpose. A malicious user can force PK collisions with ease. Furthermore, my understanding is that I should expect a greater collision chance if I chose to generate UUIDs client-side. Scratching that.
Generate a composite PK based on Playlist GUID and position in Playlist
I thought that this was a tricky, but great solution to my issue. A PlaylistItem's position is unique to a given Playlist collection and it is derivable both client-side and server-side. This seemed like a great fix. Unfortunately, having my position be part of the PK breaks the immutability of my PK. Whenever a PlaylistItem is reordered or deleted -- a large amount of PlaylistItem keys would need to be updated. Scratching that.
Generating a composite PK based on Playlist GUID and an auto-increment PlaylistItem ID
This solution is similar to the one above, but ensures that the PK is immutable by separating the composite key from the position. This is the current solution I am toying with. My only concern is that a malicious user could force collisions by modifying the auto-incremented id of the client before sending along. I don't think that this sort of malicious act would cause any harm to the system, but something to consider.
Okay! There you have it. Am I being stupid for doing all of this? Do I just suck it up and force my server to generate the GUIDs for my PlaylistItem objects? Or, is it possible to write a proper implementation?
UPDATE:
I am hoping to represent the user's action visually before the server has successfully saved to the database and implement the needed recovery techniques if the save fails. I am unsure if this is fool-hardy, but I will explain my reasoning through a use case scenario:
The client would like to add a new PlaylistItem. To do so, a request to YouTube's API is made for all the necessary information to create a PlaylistItem. The client has all necessary information to create a PlaylistItem after YouTube's API has responded, except for the ability to uniquely identify it.
At this point, the user has already waited X timeframe for YouTube's API. Now, I would like to visually show the PlaylistItem on the client. If I opt to wait for the server, I am now waiting X + Y timeframe before there is a visual indication of success. In testing, this delay felt awkward.
My server is just a micro instance on Amazon's EC2. I could reduce Y timeframe by upgrading hardware, but I could eliminate Y completely with clever programming. This is the dilemma I am facing.
Okay, as you seemed to like it when I suggested it in a comment :)
You could use a high/low approach, which basically allows a client to reserve a bunch of keys at a time. The simplest way would probably be to make it a composite primary key, consisting of two integers. The client would have one call along the lines of "give me a major key". You'd autoincrement the "next major key" sequence, and record which client "owns" that major key. That client can then use any minor key alongside that major key, and know that they'll be isolated from any other clients.
When the client performs an insert, you can check that the client is using the right major key, i.e. one assigned to them.
Of course, an alternative way of approaching this would be to just make the primary key { client ID, UUID } and let the client specify any UUID...
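The high/low approach described above can be sketched as follows. This is a minimal in-memory illustration in Python, assuming string client ids and a set standing in for the table; KeyServer, KeyClient and their methods are hypothetical names.

```python
import itertools
from typing import Dict, Set, Tuple

Key = Tuple[int, int]  # (major, minor)


class KeyServer:
    def __init__(self) -> None:
        self._next_major = itertools.count(1)
        self._owner_of_major: Dict[int, str] = {}
        self._used: Set[Key] = set()

    def reserve_major(self, client_id: str) -> int:
        """Hand out the next major key and record which client owns it."""
        major = next(self._next_major)
        self._owner_of_major[major] = client_id
        return major

    def insert(self, client_id: str, key: Key) -> None:
        major, _minor = key
        # Reject keys whose major part was not assigned to this client.
        if self._owner_of_major.get(major) != client_id:
            raise PermissionError("major key not assigned to this client")
        if key in self._used:
            raise ValueError("duplicate key")
        self._used.add(key)


class KeyClient:
    def __init__(self, client_id: str, server: KeyServer) -> None:
        self.client_id = client_id
        self._major = server.reserve_major(client_id)  # one round trip
        self._minors = itertools.count(1)

    def next_key(self) -> Key:
        """Generate a key locally, with no server call."""
        return (self._major, next(self._minors))
```

A malicious client can still reuse one of its own minor keys, but the damage is confined to its own major key, and the server rejects the duplicate on insert.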

Unique identifiers for each resource in RESTful API?

In an ideal RESTful API that supports multiple accounts, should each resource have its own unique identifier across the entire system, or is it OK if that identifier is unique only within the specific account that it belongs to?
Are there any pros and cons for each scenario?
To give an example.
Would this be fine from the REST principles?
http://api.example.com/account/1/users/1
...
http://api.example.com/account/50/users/1
or would this approach be recommended?
http://api.example.com/account/1/users/{UNIQUE_IDENTIFIER}
...
http://api.example.com/account/50/users/{ANOTHER_UNIQUE_IDENTIFIER}
You reveal valid user numbers by always having the first user as 1. Someone then knows that any account will also have a user 1. I'm not saying that you should hide user IDs just through obscurity but why make it easy for someone to find the user IDs in another account?
All that really matters is that each resource has a unique identifier. Both of your examples accomplish that, so you seem to be okay (RESTfully speaking)
I don't see any compelling reason to use one over the other. I'd choose whatever makes more sense for your implementation.
From the perspective of an external system using your REST API, the entire address should be considered the "identifier" for that resource object, so your first example is fine.
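The point that the whole address is the identifier can be illustrated with a small Python sketch in which user ids are unique only within an account and the storage key is the (account id, user id) pair. The names users, user_path and add_user are hypothetical, and counting existing users to pick the next id is a simplification that ignores deletions.

```python
from typing import Dict, Tuple

# Composite key: a user id is only meaningful together with its account id.
users: Dict[Tuple[int, int], str] = {}


def user_path(account_id: int, user_id: int) -> str:
    """Build the resource path; the full path is what identifies the user."""
    return f"/account/{account_id}/users/{user_id}"


def add_user(account_id: int, name: str) -> str:
    # Per-account numbering: each account starts again at user 1.
    user_id = 1 + sum(1 for (acct, _uid) in users if acct == account_id)
    users[(account_id, user_id)] = name
    return user_path(account_id, user_id)
```

Two accounts can both have a user 1 here without any clash, because no lookup ever uses the user id on its own.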

Should I have one email/user account for 3rd party APIs or Individual ones for each?

Is there a best practice for using email/user accounts for 3rd party APIs in a business scenario?
For example, say my company domain is foo.com, and I need to access data from Flickr, YouTube, Twitter, Facebook, Jigsaw, Amazon, eBay, and many others.
Should I have separate email addresses/user names like flickerapi#foo.com, youtubeapi#foo.com, facebookapi#foo.com, or something like apiuser#foo.com, with a consistent username used across services if they require a separate user name? What do you do? Are there any disadvantages or advantages to one or the other? The obvious disadvantage of multiple, to me, would be remembering all the email addresses.
There are many facets to the answer to this question, and I don't think there is any single obviously superior way.
To be safe you should plan on having multiple, just in case the one you are trying to reserve is already taken (it's rare, but it happens). That way you can plan on using a single one but you are prepared if something in your design has to change.
The rest is about visibility, and how risk-averse you want to be. Having one account per service means that if one is compromised (password is discovered, etc.) it's the only one affected (assuming you use different credentials for each). The downside is that it's very obvious these all point to the same place (not necessarily bad) and abuse of one could lead to problems in other places.
Having multiple accounts mitigates some of this, but you have other headaches, such as multiple passwords, managing multiple expiration processes, and auditing to make sure the accounts all still work, etc.