My question concerns keychains in iOS (iPhone, iPad, ...). I think (but am not sure) that the implementation of keychains under Mac OS X raises the same question with the same answer.
iOS provides five types (classes) of keychain items. You must choose one of those five values for the key kSecClass to determine the type:
kSecClassGenericPassword used to store a generic password
kSecClassInternetPassword used to store an internet password
kSecClassCertificate used to store a certificate
kSecClassKey used to store a cryptographic key
kSecClassIdentity used to store an identity (certificate + private key)
After a long time reading Apple's documentation, blogs and forum entries, I found out that a keychain item of type kSecClassGenericPassword gets its uniqueness from the attributes kSecAttrAccessGroup, kSecAttrAccount and kSecAttrService.
If those three attributes in request 1 are the same as in request 2, then you receive the same generic password keychain item, regardless of any other attributes. If one (or two or all) of these attributes changes its value, then you get different items.
But kSecAttrService is only available for items of type kSecClassGenericPassword, so it can't be part of the "unique key" of an item of any other type, and there seems to be no documentation that points out clearly which attributes uniquely determine a keychain item.
The sample code in the class "KeychainItemWrapper" of "GenericKeychain" uses the attribute kSecAttrGeneric to make an item unique, but this is a bug. The two entries in this example are stored as two distinct entries only because their kSecAttrAccessGroup is different (one has the access group set, the other leaves it unset). If you try to add a second password without an access group, using Apple's KeychainItemWrapper, you will fail.
So, please, answer my questions:
Is it true, that the combination of kSecAttrAccessGroup, kSecAttrAccount and kSecAttrService is the "unique key" of a keychain item whose kSecClass is kSecClassGenericPassword?
Which attributes make a keychain item unique if its kSecClass is not kSecClassGenericPassword?
The primary keys are as follows (derived from open source files from Apple, see Schema.m4, KeySchema.m4 and SecItem.cpp):
For a keychain item of class kSecClassGenericPassword, the primary key is the combination of
kSecAttrAccount and kSecAttrService.
For a keychain item of class kSecClassInternetPassword, the primary key is the combination of kSecAttrAccount, kSecAttrSecurityDomain, kSecAttrServer, kSecAttrProtocol, kSecAttrAuthenticationType, kSecAttrPort and kSecAttrPath.
For a keychain item of class kSecClassCertificate, the primary key is the combination of kSecAttrCertificateType, kSecAttrIssuer and kSecAttrSerialNumber.
For a keychain item of class kSecClassKey, the primary key is the combination of kSecAttrApplicationLabel, kSecAttrApplicationTag, kSecAttrKeyType,
kSecAttrKeySizeInBits, kSecAttrEffectiveKeySize, and the creator, start date and end date which are not exposed by SecItem yet.
For a keychain item of class kSecClassIdentity I haven't found info on the primary key fields in the open source files, but as an identity is the combination of a private key and a certificate, I assume the primary key is the combination of the primary key fields for kSecClassKey and kSecClassCertificate.
As each keychain item belongs to a keychain access group, it feels like the keychain access group (field kSecAttrAccessGroup) is an added field to all these primary keys.
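The primary-key rule listed above can be pictured with a small in-memory model. This is only a sketch in Python, not the Security framework: the `ToyKeychain` class, the `DuplicateItem` exception and the example values are hypothetical, only three item classes are modelled, and it assumes (as suspected above) that kSecAttrAccessGroup is part of every primary key.

```python
# Toy model of keychain uniqueness; it does not call the Security framework.
PRIMARY_KEY_ATTRS = {
    "kSecClassGenericPassword": ("kSecAttrAccount", "kSecAttrService"),
    "kSecClassInternetPassword": (
        "kSecAttrAccount", "kSecAttrSecurityDomain", "kSecAttrServer",
        "kSecAttrProtocol", "kSecAttrAuthenticationType",
        "kSecAttrPort", "kSecAttrPath",
    ),
    "kSecClassCertificate": (
        "kSecAttrCertificateType", "kSecAttrIssuer", "kSecAttrSerialNumber",
    ),
}


class DuplicateItem(Exception):
    """Stands in for the errSecDuplicateItem status."""


def primary_key(item_class, attrs):
    # Assumption: the access group is an extra field of every primary key.
    names = ("kSecAttrAccessGroup",) + PRIMARY_KEY_ATTRS[item_class]
    return (item_class,) + tuple(attrs.get(name) for name in names)


class ToyKeychain:
    def __init__(self):
        self._items = {}

    def add(self, item_class, attrs):
        key = primary_key(item_class, attrs)
        if key in self._items:
            raise DuplicateItem(key)
        self._items[key] = dict(attrs)


keychain = ToyKeychain()
base = {"kSecAttrAccount": "alice", "kSecAttrService": "com.example.app"}
keychain.add("kSecClassGenericPassword", {**base, "kSecAttrGeneric": "first"})
try:
    # Only kSecAttrGeneric differs, so the primary key is unchanged.
    keychain.add("kSecClassGenericPassword", {**base, "kSecAttrGeneric": "second"})
    second_add = "stored"
except DuplicateItem:
    second_add = "duplicate"
```

In this model a differing kSecAttrGeneric value does not make a second item, which matches the KeychainItemWrapper behaviour described in the question.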
I was hitting a bug the other day (on iOS 7.1) that is related to this question. I was using SecItemCopyMatching to read a kSecClassGenericPassword item and it kept returning errSecItemNotFound (-25300) even though kSecAttrAccessGroup, kSecAttrAccount and kSecAttrService were all matching the item in the keychain.
Eventually I figured out that kSecAttrAccessible didn't match. The value in the keychain held pdmn = dk (kSecAttrAccessibleAlways), but I was using kSecAttrAccessibleWhenUnlocked.
Of course this value is not needed in the first place for SecItemCopyMatching, but the OSStatus was not errSecParam nor errSecBadReq but just errSecItemNotFound (-25300) which made it a bit tricky to find.
For SecItemUpdate I have experienced the same issue but in this method even using the same kSecAttrAccessible in the query parameter didn't work. Only completely removing this attribute fixed it.
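One way to picture the SecItemCopyMatching behaviour is to treat every attribute in the query as a filter. The following is a Python sketch under that assumption, not Apple's implementation: `copy_matching` and the stored values are hypothetical, and the SecItemUpdate case is not modelled.

```python
def copy_matching(items, query):
    """Return the first item whose attributes equal every attribute in the query."""
    for item in items:
        if all(item.get(name) == value for name, value in query.items()):
            return item
    return None  # stands in for errSecItemNotFound


stored = [{
    "kSecAttrAccount": "alice",
    "kSecAttrService": "com.example.app",
    "kSecAttrAccessible": "kSecAttrAccessibleAlways",
}]

key_only = {"kSecAttrAccount": "alice", "kSecAttrService": "com.example.app"}
over_specified = {**key_only, "kSecAttrAccessible": "kSecAttrAccessibleWhenUnlocked"}

found = copy_matching(stored, key_only)         # matches on the key attributes
missed = copy_matching(stored, over_specified)  # extra attribute differs: no match
```

Under this reading, a query that carries only the identifying attributes is the safer choice for lookups.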
I hope this comment will save a few precious debugging moments for some of you.
The answer given by @Tammo Freese seems to be correct (but does not mention all primary keys). I was searching for some proof in the documentation. Finally found:
Apple Documentation mentioning primary keys for each class of secret (quote below):
The system considers an item to be a duplicate for a given keychain when that keychain already has an item of the same class with the same set of composite primary keys. Each class of keychain item has a different set of primary keys, although a few attributes are used in common across all classes. In particular, where applicable, kSecAttrSynchronizable and kSecAttrAccessGroup are part of the set of primary keys. The additional per-class primary keys are listed below:
For generic passwords, the primary keys include kSecAttrAccount and
kSecAttrService.
For internet passwords, the primary keys include kSecAttrAccount,
kSecAttrSecurityDomain, kSecAttrServer, kSecAttrProtocol,
kSecAttrAuthenticationType, kSecAttrPort, and kSecAttrPath.
For certificates, the primary keys include kSecAttrCertificateType,
kSecAttrIssuer, and kSecAttrSerialNumber.
For key items, the primary keys include kSecAttrKeyClass,
kSecAttrKeyType, kSecAttrApplicationLabel, kSecAttrApplicationTag,
kSecAttrKeySizeInBits, and kSecAttrEffectiveKeySize.
For identity items, which are a certificate and a private key bundled
together, the primary keys are the same as for a certificate. Because
a private key may be certified more than once, the uniqueness of the
certificate determines that of the identity.
Here is another piece of useful information about uniqueness of a keychain item, found in the "Ensure Searchability" section of this Apple docs page.
To be able to find the item later, you’re going to use your knowledge of its attributes. In this example, the server and the account are the item’s distinguishing characteristics. For constant attributes (here, the server), use the same value during lookup. In contrast, the account attribute is dynamic, because it holds a value provided by the user at runtime. As long as your app never adds similar items with varying attributes (such as passwords for different accounts on the same server), you can omit these dynamic attributes as search parameters and instead retrieve them along with the item. As a result, when you look up the password, you also get the corresponding username.
If your app does add items with varying dynamic attributes, you’ll need a way to choose among them during retrieval. One option is to record information about the items in another way. For example, if you keep records of users in a Core Data model, you store the username there after using keychain services to store the password field. Later, you use the user name pulled from your data model to condition the search for the password.
In other cases, it may make sense to further characterize the item by adding more attributes. For example, you might include the kSecAttrLabel attribute in the original add query, providing a string that marks the item for the particular purpose. Then you’ll be able to use this attribute to narrow your search later.
Item of class kSecClassInternetPassword was used in the example, but there is a note that says:
Keychain services also offers the related kSecClassGenericPassword item class. Generic passwords are similar in most respects to Internet passwords, but they lack certain attributes specific to remote access (for example, they don’t have a kSecAttrServer attribute). When you don’t need these extra attributes, use a generic password instead.
I am a bit confused with the partial keys. 'Database System Concepts by Korth' says the following:
Although the weak entity set does not have a primary key, we
nevertheless need a means of distinguishing among all those entities in
the weak entity set that depend on one particular strong entity. The
discriminator of a weak entity set is a set of attributes that allows
this distinction to be made. The discriminator of a weak entity set is
also called the partial key of the entity set.
My confusion is that if the discriminator/partial keys of weak entities are able to uniquely identify the set of attributes, then it should be called primary key, instead of partial keys, as primary keys are those which can uniquely identify all the attributes of a relation.
Also, while surfing the web, I came across a definition of partial key, which says:
'A partial key is a key using which all the records of the table can not be identified uniquely'
It raises a question in my mind, that suppose if a table consists of a primary key which is made up of two or more attributes, then if we pick up a single attribute from this, then will it be called partial key, as that attribute is part of a primary key, but by itself it can't uniquely identify all attributes in a relation.
The definition doesn't say that "the discriminator/partial keys of weak entities are able to uniquely identify" within a table. It says that one identifies a weak entity within a particular strong entity.
Technical terms only mean what they are defined to mean in a certain context of assumptions, including other definitions. You can't expect the same term to mean the same thing everywhere. You can't just look at the text of a definition & make assumptions about what situations it applies to & what its technical terms mean or even whether a word is used in a technical or everyday meaning. When someone uses a term you have to make sure that you know what they mean by it.
A relational superkey uniquely identifies a row. A CK (candidate key) is a superkey that contains no smaller superkey. A PK (primary key) is just some CK you decided to call the PK. So being unique is not a reason to call something a PK or CK. (An SQL PK/UNIQUE is analogous to a relational superkey.)
The book method generates discriminators that are not superkeys. So we can say that it agrees with the web definition--for cases that come up in that method. But if a method allowed generation of discriminators that were CKs or PKs then its use of that textbook wording would define "partial key" to be a different sort of thing than the web definition. Such a method couldn't use (relational) "PK" for a strong id plus discriminator, because it would be a superkey but not a CK or PK. (But it could still use SQL "PK" since that approximately means primary superkey.)
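The superkey and candidate-key definitions above can be checked mechanically against sample data. This is a Python sketch with hypothetical rows and function names. Strictly, a key is a constraint on every legal state of a relation, so testing one sample can only refute a key, never prove it.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def is_candidate_key(rows, attrs):
    """True if attrs is a superkey and no proper subset of it is one."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(len(attrs))
        for subset in combinations(attrs, size)
    )


rows = [
    {"emp_id": 1, "email": "a@example.com", "dept": "ops"},
    {"emp_id": 2, "email": "b@example.com", "dept": "ops"},
    {"emp_id": 3, "email": "c@example.com", "dept": "dev"},
]

unique_but_not_minimal = ("emp_id", "dept")
superkey_result = is_superkey(rows, unique_but_not_minimal)
candidate_result = is_candidate_key(rows, unique_but_not_minimal)
```

Here `("emp_id", "dept")` is unique in the sample but is not a candidate key, because `("emp_id",)` alone is already a superkey.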
I really think this type of description stems from the very first step in any modelling process, and one which anyone with any data modelling experience would just fix without even thinking about it.
The wiki page on "Weak Entity" gives the classic example of a Header/Detail pair, where the detail by itself doesn't have a reference to the header. Think of a two page document where page one is the header, page two is the details.
By itself, page two can not uniquely identify a row, but of course anyone would automatically add the header FK so we can uniquely identify a row.
Haven't seen the book you are reading, but I think that's what it's getting at. So I think all your subsequent reasoning is correct. Have a look at the wiki page for more info.
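The header/detail example can be made concrete with a few detail rows. This is a Python sketch with hypothetical data: `line_no` plays the role of the discriminator (partial key) and `order_id` is the foreign key to the header.

```python
from collections import Counter

# Detail rows of two orders; line_no restarts at 1 for every order.
details = [
    {"order_id": 100, "line_no": 1, "sku": "pen"},
    {"order_id": 100, "line_no": 2, "sku": "ink"},
    {"order_id": 200, "line_no": 1, "sku": "pad"},
]


def duplicates(rows, attrs):
    """Return the attribute-value combinations that occur in more than one row."""
    counts = Counter(tuple(row[a] for a in attrs) for row in rows)
    return [key for key, count in counts.items() if count > 1]


# The discriminator alone repeats across orders ...
by_discriminator = duplicates(details, ("line_no",))
# ... but together with the owner's key it identifies each row.
by_full_key = duplicates(details, ("order_id", "line_no"))
```

The discriminator distinguishes rows only within one owning order, which is why it is called a partial key rather than a primary key.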
We have a few IBM Notes databases here, at least one hundred I think, and if we have to identify a user we are using the given Name at the moment. We are also connecting this with a database of all the employees here, using it to do time-management and administrative stuff.
Therefore we need to determine which user is which; as I said, we are doing that by the name at the moment. But names change, so now we would like to switch to an ID that does not change. I thought we could use the key identifier, or one of them at least. So my question is, is there a way to get it through LotusScript? If not, is there another way to identify the user of a certain key-file?
Lotus Notes and Domino do not have any builtin unique key identifier for users. It was never part of the design. You can't use the noteid of the Person document because that varies from one replica of the Domino Directory to another, and you should not use the unid, because although that's stable across replicas it can still change if you have to recreate the Person document, which you might have to do if the employee leaves your company and then comes back, or if the Person document is damaged.
The way most large organizations deal with this is to set the EmployeeID field in the Person document and use that as the unique identifier. Some organizations might also create unique identifiers and use them for the ShortName.
Whilst I don't disagree with Richard's answer in general, it is possible to 'capture' a Document-ID (UNID) in a separate 'created when composed' field. This will then hold a static 'unique' 32 character reference. Once this has been set, it should not change unless you have a process to change it. A UNID is based on/derived from a time-stamp so they are extremely unlikely to be reused or clash with future system generated unids, even after many years or many millions of docs being used. This captured value may or may not agree with the actual system assigned unid, but this generally doesn't matter.
If you are unhappy with a copy of a system unid, then this could be used on a temporary basis until it can be overridden with an external ID reference. Alternatively, use a proven external 'guid' reference and assign that to the records you want to track.
If the raw unid creates side issues for you, use @Password to hash it to another unique value. This generates a quick/old MD5 style hash, but good enough for reference-IDs.
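The 'set once, never overwrite' idea can be sketched outside Notes. The Python below is only an illustration under stated assumptions: the `StableRef` field name is hypothetical, a random UUID stands in for the captured UNID, and `hashlib.md5` stands in for the MD5-style hash mentioned above (it is not the algorithm Notes uses).

```python
import hashlib
import uuid


def new_reference_id():
    """Return a random 32-character hexadecimal identifier."""
    return uuid.uuid4().hex


def ensure_reference(person):
    """Assign the reference when the record is first composed; never overwrite it."""
    if "StableRef" not in person:
        person["StableRef"] = new_reference_id()
    return person["StableRef"]


def hashed_reference(raw_id):
    """Derive a second stable value from the raw identifier."""
    return hashlib.md5(raw_id.encode("utf-8")).hexdigest()


person = {"FullName": ["Jane Oldname"]}
first_ref = ensure_reference(person)

# A later name change leaves the captured reference untouched.
person["FullName"] = ["Jane Newname", "Jane Oldname"]
second_ref = ensure_reference(person)
```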
One other point to mention is that keeping previous names in a list is also feasible, so that matching an 'old name' is still viable. Either create an additional field in the person doc (admins may hate you for doing this) or add older names to the 'fullname' field list. This usually contains a list of name variations. If you add the older name to the bottom of the list (i.e. append), then you and the system can match against this name (for logins and routing) but Notes only ever uses the first name in the list as the user's 'official' name (for reader/author fields etc.), which should be the current name. To find the name, just look up the name in the ($Users) view if using the FullName field, or create a new view using your own field (again, admins may hate you more).
I am trying to read up on best practices on DynamoDB. I saw that DynamoDB has two PK types:
Hash Key
Hash and Range Key
From what I read, it appears the latter is like the former but supports sorting and indexing of a finite set of columns.
So my question is why ever use only a hash key without a range key? Is it a viable choice only when the table is not searched?
It'd also be great to have some general guidelines on when to use what key type. I've read several guides (including Amazon's own documentation on DynamoDB) but none of them appear to directly address this question.
Thanks
The choice of which key to use comes down to your Use Cases and Data Requirements for a particular scenario. For example, if you are storing User Session Data it might not make much sense using the Range Key since each record could be referenced by a GUID and accessed directly with no grouping requirements. In general terms once you know the Session Id you just get the specific item querying by the key. Another example could be storing User Account or Profile data, each user has his own and you most likely will access it directly (by User Id or something else).
However, if you are storing Order Items then the Range Key makes much more sense since you probably want to retrieve the items grouped by their Order.
In terms of the Data Model, the Hash Key allows you to uniquely identify a record from your table, and the Range Key can be optionally used to group and sort several records that are usually retrieved together. Example: If you are defining an Aggregate to store Order Items, the Order Id could be your Hash Key, and the OrderItemId the Range Key. Whenever you would like to search the Order Items from a particular Order, you just query by the Hash Key (Order Id), and you will get all your order items.
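The Order / Order Item example can be modelled in memory to show what the range key adds. This is a Python sketch, not the DynamoDB API: `HashRangeTable`, its methods and the sample data are hypothetical.

```python
from collections import defaultdict


class HashRangeTable:
    """In-memory stand-in for a table with a hash key and a range key."""

    def __init__(self):
        self._partitions = defaultdict(dict)  # hash key -> {range key: item}

    def put(self, hash_key, range_key, item):
        self._partitions[hash_key][range_key] = item

    def get(self, hash_key, range_key):
        """Direct lookup: both parts of the key are needed to name one item."""
        return self._partitions.get(hash_key, {}).get(range_key)

    def query(self, hash_key, low=None, high=None):
        """All items for one hash key, sorted by range key, optionally bounded."""
        rows = self._partitions.get(hash_key, {})
        return [
            rows[rk] for rk in sorted(rows)
            if (low is None or rk >= low) and (high is None or rk <= high)
        ]


orders = HashRangeTable()
orders.put("order-1", 1, {"sku": "pen"})
orders.put("order-1", 3, {"sku": "pad"})
orders.put("order-1", 2, {"sku": "ink"})
orders.put("order-2", 1, {"sku": "cap"})

all_items = orders.query("order-1")         # every item of the order, in order
tail_items = orders.query("order-1", low=2)  # range condition on the range key
```

A hash-only table corresponds to calling `get` with a single key, which is all the session and profile examples above require.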
You can find below a formal definition for the use of these two keys:
"Composite Hash Key with Range Key allows the developer to create a
primary key that is the composite of two attributes, a 'hash
attribute' and a 'range attribute.' When querying against a composite
key, the hash attribute needs to be uniquely matched but a range
operation can be specified for the range attribute: e.g. all orders
from Werner in the past 24 hours, or all games played by an individual
player in the past 24 hours." [VOGELS]
So the Range Key adds a grouping capability to the Data Model; however, the use of these two keys also has an implication for the Storage Model:
"Dynamo uses consistent hashing to partition its key space across its
replicas and to ensure uniform load distribution. A uniform key
distribution can help us achieve uniform load distribution assuming
the access distribution of keys is not highly skewed."
[DDB-SOSP2007]
Not only does the Hash Key allow you to uniquely identify the record, it is also the mechanism that ensures load distribution. The Range Key (when used) indicates which records will mostly be retrieved together, so the storage can also be optimized for that need.
Choosing the correct keys to represent your data is one of the most critical aspects of your design process, and it directly impacts how well your application will perform and scale, and how much it will cost.
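The effect of the hash key on load distribution can be illustrated with a deliberately simplified model. The Python sketch below uses plain modulo hashing rather than the consistent hashing described in the quote, and `partition_for`, the partition count and the key names are all hypothetical.

```python
import hashlib
from collections import Counter


def partition_for(hash_key, partitions=4):
    """Map a hash key to a partition number (modulo hashing, for illustration)."""
    digest = hashlib.sha256(hash_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


# Many distinct hash keys: requests spread across the partitions.
spread = Counter(partition_for(f"session-{n}") for n in range(1000))

# One hot hash key: every request lands on the same partition.
hot = Counter(partition_for("tenant-1") for _ in range(1000))
```

A hash key with many distinct, evenly accessed values spreads the load, while a key with few values or one very hot value concentrates it.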
Footnotes:
The Data Model is the model through which we perceive and manipulate our data. It describes how we interact with the data in the database [FOWLER]. In other words, it is how you abstract your data: the way you group your entities, the attributes that you choose as primary keys, etc.
The Storage Model describes how the database stores and manipulates the data internally [FOWLER]. Although you cannot control this directly, you can certainly optimize how the data is retrieved or written by knowing how the database works internally.
I read somewhere that it is bad to use your db table's primary key as a public identifier online. However, I would like my users to link to a specific object in the table.
How do I add a unique identifier column to my table that is unrelated to the primary key (which is an auto-increment integer)?
My initial idea is to use a PHP script to generate random hexadecimal values of suitable length (there will be about 100 000-200 000 items in the table at most, I think) and then insert them. But then I don't know if they would be unique...
You can use a GUID (Globally Unique IDentifier) to uniquely identify a record. The number of possible GUIDs is so high that the chance of duplicating one is next to nothing. Similarly, the chance of someone guessing the GUID is so low that generally they are safe to display to the user (for example www.yoursite.com?id=21EC20203AEA1069A2DD08002B30309D).
If you're using PHP you can use the com_create_guid function. *Note: This function is only supported in PHP5. For PHP4, look at uniqid.
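The uniqueness worry in the question can be handed to the database itself with a UNIQUE constraint on the public column. This is a sketch in Python with SQLite rather than PHP. The `items` table, the `public_id` column and the function names are hypothetical, and the retry loop is only a safeguard for the very unlikely collision.

```python
import sqlite3
import uuid


def create_schema(conn):
    conn.execute(
        "CREATE TABLE items ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"  # internal key, never shown
        " public_id TEXT NOT NULL UNIQUE,"        # identifier used in links
        " name TEXT NOT NULL)"
    )


def insert_item(conn, name, max_attempts=3):
    """Insert a row with a random public_id; retry if the value already exists."""
    for _ in range(max_attempts):
        public_id = uuid.uuid4().hex  # 32 random hexadecimal characters
        try:
            with conn:
                conn.execute(
                    "INSERT INTO items (public_id, name) VALUES (?, ?)",
                    (public_id, name),
                )
            return public_id
        except sqlite3.IntegrityError:
            continue  # UNIQUE constraint rejected the value: try another one
    raise RuntimeError("could not generate a unique public_id")


def find_by_public_id(conn, public_id):
    return conn.execute(
        "SELECT id, name FROM items WHERE public_id = ?", (public_id,)
    ).fetchone()


conn = sqlite3.connect(":memory:")
create_schema(conn)
first_id = insert_item(conn, "first")
second_id = insert_item(conn, "second")
```

Links then carry `public_id`, while joins and foreign keys keep using the integer primary key.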
I am creating the model for a web application. The tables have ID fields as primary keys. My question is whether one should define ID as a property of the class?
I am divided on the issue because it is not clear to me whether I should treat the object as a representation of the table structure or whether I should regard the table as a means to persist the object.
If I take the former route then ID becomes a property because it is part of the structure of the database table; however, if I take the latter approach then ID could be viewed as a piece of metadata belonging to the database which is not strictly a part of the object's model.
And then we arrive at the middle ground. While the ID is not really a part of the object I'm trying to model, I do realise that the objects are retrieved from and persisted to the database, and that the ID of an object in the database is critical to many operations of the system, so it might be advantageous to include it to ease interactions where an ID is used.
I'm a solo developer, so I'd really like some other, probably more experienced perspectives on the issue
Basically: yes.
All the persistence frameworks I've used (including Hibernate, iBATIS) do require the ID to be on the Object.
I understand your point about metadata, but an Object from a database should really derive its identity in the same way the database does - usually an int primary key. Then Object-level equality should be derived from that.
Sometimes you have primary keys that are composite, e.g. first name and last name (don't ever do this!), in which case the primary key isn't 'metadata' because it is part of the Object's identity.
I generally reserve the ID column of an object for the database. My opinion is that to use it for any 'customer-facing' purpose, (for example, use the primary key ID as a customer number) you will always shoot yourself in the foot later.
If you ever make changes to the existing data (instead of exclusively adding new data), you need the PK. Otherwise you don't know which record to change in the DB.
You should have the ID in the object. It is essential.
The easiest use case to give as an example is testing equality:
// Entity is a placeholder for any persisted class that exposes an ID.
public bool Equals(Entity a, Entity b) { return a.ID == b.ID; }
Anything else is subject to errors, and you'll find that out when you start getting primary key violations or start overwriting existing data.
By counterargument:
Say you don't have the ID in the object. Once you change an object, and don't have its ID from the database, how will you know which record to update?
At the same time, you should note that the operations I mention are really private to the object instance, so ID does not necessarily have to be a public property.
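The same idea, with the ID kept non-public, might look like the following sketch. It is written in Python rather than the C#-style snippet above, and the `Entity` and `Customer` classes are hypothetical.

```python
class Entity:
    """Base class whose equality follows the database identity."""

    def __init__(self, entity_id=None):
        self._id = entity_id  # None until the object has been persisted

    def __eq__(self, other):
        if type(self) is not type(other):
            return NotImplemented
        if self._id is None or other._id is None:
            return self is other  # unsaved objects only equal themselves
        return self._id == other._id

    def __hash__(self):
        # Note: the hash changes when an unsaved object receives its ID.
        if self._id is None:
            return id(self)
        return hash((type(self), self._id))


class Customer(Entity):
    def __init__(self, entity_id=None, name=""):
        super().__init__(entity_id)
        self.name = name


same_row = Customer(1, "Ann") == Customer(1, "Ann Smith")  # same ID, edited name
other_row = Customer(1, "Ann") == Customer(2, "Ann")       # same name, other row
unsaved = Customer(name="Bob") == Customer(name="Bob")     # neither has an ID yet
```

Two objects loaded from the same row compare equal even after one of them has been edited, which is the property the update and overwrite scenarios above rely on.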
I include the ID as a property. Having a simple unique identifier for an object is often very handy regardless of whether the object is persisted in a database or not. It also makes your database queries much more simple.
I would say that the table is just a means to persist an object, but that doesn't mean the object can't have an ID.
I'm very much of the mindset that the table is a means to persist the object, but, even so, I always expose the IDs on my objects for two primary reasons:
The database ID is the most convenient way to uniquely identify an object, either within a class (if you're using a per-table serial/autonumber ID) or universally (if you're maintaining a separate "ID-to-class" mapping). In the context of web applications, it makes everything much simpler and more efficient if your forms are able to just specify <input type=hidden name=id value=12345> instead of having to provide multiple fields which collectively contain sufficient information to identify the target object (or, worse, use some scheme to concatenate enough identifying information into a single string, then break it back down when the form is submitted).
It needs to have an ID anyhow in order to maintain a sane database structure and there's no reason not to expose it.
Should the ID in the object be read-only or not? In my mind it should be read-only, as by definition the ID will never change (it uniquely identifies a record in the database).
This creates a problem when you create a new object (ID not set yet) and save it in the database through a stored procedure that returns the newly created ID: how do you store it back in the object if the ID property is read-only?
Example:
Employee employee = new Employee();
employee.FirstName="John";
employee.LastName="Smith";
EmployeeDAL.Save(employee);
How does the Save method (which actually connects to the database to save the new employee) update the EmployeeId property in the Employee object if this property is read-only (which it should be, as the EmployeeId will never change once it's created)?
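One possible arrangement is to expose the ID through a read-only property and let the data access layer assign it exactly once through a non-public method. This is a Python sketch of that idea, not the only answer: the `_assign_id` method is hypothetical and an in-memory counter stands in for the stored procedure.

```python
import itertools


class Employee:
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name
        self._employee_id = None  # not set until the first save

    @property
    def employee_id(self):
        """Read-only for callers: there is no public setter."""
        return self._employee_id

    def _assign_id(self, new_id):
        """Intended for the data access layer only; allowed exactly once."""
        if self._employee_id is not None:
            raise ValueError("employee_id is already set")
        self._employee_id = new_id


class EmployeeDAL:
    _next_id = itertools.count(1)  # stands in for the stored procedure

    @classmethod
    def save(cls, employee):
        if employee.employee_id is None:
            employee._assign_id(next(cls._next_id))
        return employee


employee = Employee("John", "Smith")
EmployeeDAL.save(employee)
saved_id = employee.employee_id
```

In C# or Java the same effect is usually reached with a setter that is internal or package-private to the data access layer.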