Passphrases seem like a good alternative to traditional guidelines for strong passwords. See http://xkcd.com/936/ for an entertaining take on passwords vs. passphrases.
There are many tools for generating more traditional passwords (e.g. pwgen).
Such tools are useful, for example, when providing users with good initial passwords.
What tools are available for generating good passphrases?
Do you have experience using them, or insight into their security or other features?
I've recently released a couple of Perl scripts, gen-password and gen-passphrase, on GitHub here.
The gen-passphrase script could suit your needs. It takes three arguments: a word used as a sequence of initials, a minimum length, and a maximum length. For example:
$ gen-passphrase abcde 6 8
acrimony borrowed chasten drifts educable
or you can ask for a number of words without specifying their initials (a new feature I just added):
$ gen-passphrase 5 6 8
poplin outbreak aconites academic azimuths
It requires a word list; it uses /usr/share/dict/words by default if it exists. It uses /dev/urandom by default, but can be told to use /dev/random. See this answer of mine on superuser.com for more information about /dev/urandom vs. /dev/random.
NOTE: So far, nobody other than me has tested these scripts. I've made my best effort to have them generate strong passwords/passphrases, but I guarantee nothing.
I wrote a command-line Perl script, passphrase-generator.
The passphrase-generator defaults to only 3 words instead of the 4 suggested by XKCD, but uses a larger dictionary, found on many Linux-based systems at /usr/share/dict/words. It also provides estimates of the entropy of the generated passphrases. The randomization is based on /dev/urandom and SHA1.
Example run:
$ passphrase-generator
Random passphrase generator
Entropy per passphrase is 43.2 bits (per word: 14.4 bits.)
For reference, entropy of completely random 8 character (very hard to memorize)
password of upper and lowercase letters plus numbers is 47.6 bits
Entropy of a typical human generated "strong" 8 character password is in the
ballpark of 20 - 30 bits.
Below is a list of 16 passphrases.
Assuming you select one of these based on some non random preference
your new passphrase will have entropy of 39.2 bits.
Note that first letter is always capitalized and spaces are
replaced with '1' to meet password requirements of many systems.
Goatees1maneuver1pods
Aught1fuel1hungers
Flavor1knock1foreman
Holding1holster1smarts
Vitamin1mislead1abhors
Proverbs1lactose1brat
... and so on 10 more
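The entropy figures in that output can be sanity-checked with a few lines of Python. This is only a sketch and not the script's actual code: the function names and the toy word list are made up for illustration, and it assumes every word is picked independently and uniformly.

```python
import math
import secrets


def generate_passphrase(words, count=3, separator=" "):
    """Join `count` words picked uniformly at random with a CSPRNG."""
    return separator.join(secrets.choice(words) for _ in range(count))


def passphrase_entropy_bits(wordlist_size, count):
    """Entropy of `count` independent, uniform picks from the word list."""
    return count * math.log2(wordlist_size)


def random_password_entropy_bits(alphabet_size, length):
    """Entropy of a uniformly random password over a fixed alphabet."""
    return length * math.log2(alphabet_size)


# Hypothetical toy word list; a real run would load a large dictionary.
demo_words = ["goatees", "maneuver", "pods", "aught", "fuel", "hungers"]
print(generate_passphrase(demo_words))

# 8 characters drawn from 26 + 26 + 10 = 62 symbols.
print(round(random_password_entropy_bits(62, 8), 1))  # 47.6

# Picking one of 16 offered passphrases by preference costs log2(16) = 4 bits.
print(math.log2(16))  # 4.0
```

Under the same assumptions, 14.4 bits per word corresponds to a list of roughly 2^14.4 words, and 43.2 - 4 gives the 39.2 bits quoted for choosing among 16 candidates.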
There are also some browser/JavaScript-based tools:
http://preshing.com/20110811/xkcd-password-generator
http://passphra.se/
http://lightsecond.com/passphrase.html
CPAN hosts a Perl module for generating XKCD style passphrases:
http://metacpan.org/pod/Crypt::XkcdPassword
I sometimes use this:
$ perl -e "printf \"%d\", ((~18446744073709551592)+1)"
24
I can't seem to do it with Raku. The best I could get is:
$ raku -e "say +^18446744073709551592"
-18446744073709551593
So: how can I make Raku give me the same answer as Perl?
Gotta go with (my variant¹ of) Liz's custom op (in her comment below).
sub prefix:<²^>(uint $a) { (+^ $a) + 1 }
say ²^ 18446744073709551592; # 24
My original "semi-educated wild guess"² that turned out to be acceptable to @zentrunix and the basis for Liz's op:
say (+^ my uint $ = 18446744073709551592) + 1; # 24
\o/ It works!³
Footnotes
¹ I flipped the two character op because I wanted to follow the +^ form, have it sub-vocalize as "two's complement", and avoid it looking like ^2.
² One line of thinking was about the particular integer. I saw that 18446744073709551592 is close to 2**64. Another was that integers are limited precision in Perl unless you do something to make them otherwise, whereas in Raku they are arbitrary precision unless you do something to make them otherwise. A third line of thinking came from reading the doc for prefix +^ which says "converts the number to binary using as many bytes as needed" which I interpreted as meaning that the representation is somehow important. Hmm. What if I try an int variable? Overflow. (Of course.) uint? Bingo.
³ I've no idea if this solution is right for the wrong reasons. Or even worse. One thing that's concerning is that uint in Raku is defined to correspond to the largest native unsigned integer size supported by the Raku compiler used to compile the Raku code. (Iirc.) In practice today this means Rakudo and whatever underlying platform is being targeted, and I think that almost certainly means C's uint64_t in almost all cases. I imagine perl has some similar platform dependent definition. So my solution, if it is a reasonable one, is presumably only portable to the degree that the Raku compiler (which in practice today means Rakudo) agrees with the perl binary (which in practice today means P5P's perl) when run on some platform. See also @p6steve's comment below.
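For comparison, here is the same arithmetic with the 64-bit width made explicit. This is a Python sketch rather than Raku (Python integers are also arbitrary precision), and the function name is made up; it assumes the Perl result comes from a 64-bit unsigned wrap-around.

```python
def twos_complement_negate(value, bits=64):
    """Invert all bits, add one, and keep only the low `bits` bits."""
    mask = (1 << bits) - 1
    return (~value + 1) & mask


x = 18446744073709551592

# Without a width limit, bitwise NOT of a positive integer is negative,
# which matches what Raku's +^ printed on an arbitrary-precision Int.
print(~x)  # -18446744073709551593

# Confining the result to 64 bits gives the value Perl printed.
print(twos_complement_negate(x))  # 24
```

The mask plays the role that the native uint type plays in the Raku one-liner: it is what turns the unbounded result into a fixed-width one.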
'Long-hand' answer:
raku -e 'put ( (18446744073709551592.base(2) - 0b1).comb.map({!$_.Int+0}).join.parse-base(2));'
OR
raku -e 'say 18446744073709551592.base(2).comb.map({!$_.Int+0}).join.parse-base(2) + 1;'
Sample Output: 24
The answers above (should?) implement "Two's-Complement" encoding directly. Neither uses Raku's +^ twos-complement operator. The first one subtracts one from the binary representation, then inverts. The second one inverts first, then adds one. Neither answer feels truly correct, yet the same answer as Perl5 is obtained (24).
Looking at the Raku Docs page, one would conclude that the "twos-complement" of a positive number would be negative, hence it's not clear what the Perl (and now Raku) answers represent. Hopefully the foregoing is somewhat useful.
https://docs.raku.org/routine/+$CIRCUMFLEX_ACCENT
I know that WebAssembly currently supports a 32-bit architecture, so I am supposing that, like RISCV32, its base instruction set has instructions which are 32 bits wide (of course, RISCV32 also supports 16-bit compressed instructions and 48-bit ones). RISC-V's instructions are interpreted mostly as little-endian (in terms of bit indices).
For example, in RISC-V, we can have an instruction like lui (load upper immediate to register) that embeds a 20-bit immediate into the instruction, has a 5-bit field to encode the destination register, and a 7-bit field to specify the opcode. Among other things, the opcode contains two bits at the beginning that denote whether the instruction is compressed or not. This is encoded in the specification, where lui has an LUI opcode.
RISC-V instructions also have a variety of different layouts defined in the specification. For example, the lui instruction takes the "U" format, so we know exactly where the 20-bit field and the 5-bit destination register are in the serialization.
What is the bit width of a wasm instruction? What are the possible layouts of a wasm instruction? Are there compressed instruction formats for webassembly, such as 16-bit instructions for very common operations?
If webassembly instructions are variable-width, how is the width of an instruction encoded for the interpreter?
Binary WASM bytecode has variable-length instructions, not fixed-width ones like a RISC CPU. https://en.wikipedia.org/wiki/WebAssembly#Code_representation has an example.
It's not intended to be executed directly, but rather JITed into native machine code, so a fixed-width format that would require multiple instructions for some 32 or 64-bit constants would make more work for the JIT optimizer. It would also be less compact in the WASM binary format, with more instructions to parse.
Much better for the JIT optimizer to know the ultimate goal is to materialize a whole constant, since some ISAs will be able to do that in one instruction, and others will need it split up in different parts depending on the ISA. e.g. 20:12 for RISC-V, 16:16 for ARM movw/movk or MIPS, or if the constant only has set bits in a narrow region, ARM rotated immediates can maybe still use one instruction. Or AArch64 bit-pattern immediates can materialize a constant like 0x01010101 (or 0x0101010101010101) in a single 32-bit instruction.
TL:DR: Don't make the JIT put the pieces back together before breaking back down into asm that works for the target machine.
And in general, variable-length isn't much of a problem for a stream that will be parsed once by software anyway, not decoded repeatedly by hardware every time through a loop.
Examples
A lot of WebAssembly instructions take up one byte. For example, the left shift instructions are i32.shl and i64.shl, which have the single-byte opcodes 0x74 and 0x86 without any subsequent values, while the i32.const instruction, for example, starts with 0x41 and takes from 2 to 6 bytes.
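| Instruction | Opcode |
|-------------|--------|
| i32.const   | 0x41   |
| i64.const   | 0x42   |
| f32.const   | 0x43   |
| f64.const   | 0x44   |
| -           | -      |
| i32.shl     | 0x74   |
| i64.shl     | 0x86   |
| -           | -      |
| i32.eqz     | 0x45   |
| i32.eq      | 0x46   |
| i64.eqz     | 0x50   |
| i64.eq      | 0x51   |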
And so on. The values here are taken from the MDN website. See the Numeric Instructions.
Encoding Numbers
Some instructions such as the const above require specifying the immediate, which increases the overall size of the instruction. The immediates are encoded in LEB128, and the variant depends on whether the integer is signed or unsigned. Those are normally given in the specification.
LEB128 is roughly this: the bits are padded to a multiple of seven and split into 7-bit groups, one group per byte, and the high bit of each byte signals whether more bytes follow. Those numbers are constrained to their maximum width. Floating point numbers are encoded in IEEE 754 format.
The const instructions are followed by the respective literal.
All other numeric instructions are plain opcodes without any immediates.
Source: https://webassembly.github.io/spec/core/binary/instructions.html#numeric-instructions
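To make the LEB128 description concrete, here is a minimal encoder sketch in Python. The function names are my own, and the width checks the specification requires (for example, that an i32 immediate fits in 32 bits) are left out.

```python
def encode_uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 needs a non-negative value")
    out = bytearray()
    while True:
        byte = value & 0x7F          # low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


def encode_sleb128(value):
    """Encode a signed integer as signed LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7                  # arithmetic shift keeps the sign
        sign_bit_set = bool(byte & 0x40)
        done = (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set)
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)


# i32.const is opcode 0x41 followed by the signed LEB128 immediate.
print((bytes([0x41]) + encode_sleb128(0)).hex())         # 4100
print(len(bytes([0x41]) + encode_sleb128(-2 ** 31)))     # 6
```

The two printed cases show the 2 to 6 byte range mentioned above for i32.const: one opcode byte plus one to five immediate bytes.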
Wasm instructions are represented with a unique opcode (typically 1 byte, more for newer instructions), followed by the encodings of immediate operands, for instructions that have them. There is no fixed length; it depends on both the opcode and the immediate values.
For example:
i32.add is opcode 0x6A with no immediates;
i64.const i is opcode 0x42, followed by a variable-length encoding of i in LEB128 format;
br_table l* ld is opcode 0x0E, followed by a variable-length encoding of the length of l* in LEB128, followed by as many variable-length encodings of the label indices in l*, followed by the variable-length encoding of label index ld.
See the binary grammar in the specification for details. A Wasm decoder is essentially "parsing" the binary input according to this grammar.
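As a rough illustration of that "parsing", the sketch below works out the length of a single instruction for just the three opcodes listed above. It is a toy in Python with made-up function names, not a real decoder: it does no validation and rejects every other opcode.

```python
def skip_leb128(code, pos):
    """Return the index just past the LEB128 value that starts at pos."""
    while code[pos] & 0x80:      # high bit set: another byte follows
        pos += 1
    return pos + 1


def read_uleb128(code, pos):
    """Decode an unsigned LEB128 value; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = code[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def instruction_length(code, pos=0):
    """Length in bytes of the instruction starting at pos."""
    opcode = code[pos]
    end = pos + 1
    if opcode == 0x6A:                    # i32.add: no immediates
        pass
    elif opcode in (0x41, 0x42):          # i32.const / i64.const: one LEB128
        end = skip_leb128(code, end)
    elif opcode == 0x0E:                  # br_table: count, labels, default
        count, end = read_uleb128(code, end)
        for _ in range(count + 1):
            end = skip_leb128(code, end)
    else:
        raise NotImplementedError(f"opcode 0x{opcode:02X} not handled in this sketch")
    return end - pos


print(instruction_length(bytes([0x6A])))                          # 1
print(instruction_length(bytes([0x42, 0xC0, 0xBB, 0x78])))        # 4
print(instruction_length(bytes([0x0E, 0x02, 0x00, 0x01, 0x02])))  # 5
```

The point is that the length is never stored anywhere; it falls out of the opcode plus the continuation bits of the immediates.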
Here are some citations from the current specification v2.0 related to the instructions (as "seen" by the specification itself):
some instructions also have static immediate arguments, typically
indices or type annotations, which are part of the instruction itself.
Some instructions are structured in that they bracket nested sequences of instructions.
In relation to the nesting:
Implementations typically impose additional restrictions on a number of aspects of a WebAssembly module or execution
Then, one of the noted implementation limitations is:
the nesting depth of structured control instructions
As the nesting depth of the instructions is not strictly defined by the specification, but is left to the implementation to choose, there is no limit on instruction length as far as the specification is concerned, regardless of whether the instructions are encoded as binary or text.
Even if we ignore the structured instructions (which we should not), there are many instructions that take vectors as arguments. The length of a vector is limited to 2^32-1. If my memory serves me right, there was also an instruction taking a vector of vectors as an argument.
I am trying to learn how ELF files are structured and probably how to make one manually.
I am working on aarch64 Linux OS, the ELF files I am inspecting are of elf64-littleaarch64 format.
Also I try to learn by myself, however I got stuck with some questions...
When I do xxd code, the first number in each line of the output specifies the address of bytes in the file. But when I run objdump -D code, the first number is something like 4000b0, which corresponds to 000000b0 in xxd. Why is there a four at the beginning?
In objdump, the bytecode is for example 11000a94, which 'means' add w20, w20, #2 in assembly. I know that 11 is the opcode, but what does 000a94 mean? I thought it should be the parameters, but I am adding the value 2 and can't find the number 2 in it.
If you have a good article to read, or can help me explain this, I will be very grateful!
xxd shows the offset of the bytes within the file on disk. objdump -D shows (tentatively) the address in memory where those bytes will be loaded when the program is run. It is common for them to differ by a round number. In particular, 0x400000 may correspond to one higher-level page table entry; see "Why Linux/gnu linker chose address 0x400000?", which is for x86-64, but I think ARM64 is similar (haven't checked). It doesn't have anything to do with the fact that 0x40 is ASCII @; that's just a coincidence.
Note that if ASLR is in use, the actual memory address will be randomly chosen every time the program is run, and will not match what objdump shows you, though the difference will still be a multiple of the page size.
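As for where the 2 is hiding in 11000a94: the operands are packed into bit fields that do not line up with hex digits. The Python sketch below pulls them apart, assuming the A64 ADD (immediate) layout of sf, op, S, a fixed 100010 pattern, sh, imm12, Rn, Rd from the top bit down. The function name is made up, and other instructions are simply rejected.

```python
def decode_add_immediate(word):
    """Split a 32-bit A64 ADD (immediate) word into its fields."""
    fixed = (word >> 23) & 0x3F        # bits 28..23
    op_and_s = (word >> 29) & 0b11     # bits 30..29, both 0 for plain ADD
    if fixed != 0b100010 or op_and_s != 0:
        raise ValueError("not an ADD (immediate) instruction")
    return {
        "sf": (word >> 31) & 1,        # 0 = 32-bit W registers, 1 = 64-bit X
        "sh": (word >> 22) & 1,        # 1 = immediate is shifted left by 12
        "imm12": (word >> 10) & 0xFFF,
        "rn": (word >> 5) & 0x1F,      # source register number
        "rd": word & 0x1F,             # destination register number
    }


# objdump prints the instruction as one 32-bit word; in the file the four
# bytes are stored in little-endian order.
word = int.from_bytes(bytes.fromhex("940a0011"), "little")
print(hex(word))  # 0x11000a94

fields = decode_add_immediate(word)
print(fields["imm12"], fields["rn"], fields["rd"])  # 2 20 20
```

So the immediate 2 sits in bits 21..10, and both register fields hold 20, which is why the hex digits 0a94 do not show a visible 2.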
Well, I was too fast asking this question, but now, I will answer it too.
40 at the beginning of the addresses in objdump is the hex representation of the char "@", which means "at" and points to an address, very simple!
Little Endian has CPU addresses stored in 5 bits instead of 6 or 8. That means, that I should look for the binary value of the objdump code: 11000a94 --> 10001000000000000101010010100, where it can be divided into [10001][00000000000010][10100][10100] with [opcode][value][first address][second address]
Both answers are wrong, see the accepted answer.
I will still leave them here, though.
I am trying to recover a password I have not used in a long time.
I know the words used in the passphrase, but I do not remember exactly the character substitutions,
and upper/lower case I have used. I only remember some, and know the possibilities for others.
The passphrase I am trying to recover is 15 characters long.
I have installed John the Ripper (jumbo version 1.9), and I tried to create some rules for character
substitutions I know I have used hoping to quickly generate a wordlist with all possible passphrases
based on my rules.
Let's say my passphrase is password with some character substitutions. If I use this set of rules:
sa#
ss$
so0
soO
I get those results:
p#ssword
pa$$word
passw0rd
passwOrd
When I say I am looking for all possible combinations, I mean something more like the following (this list is not exhaustive):
p#ssword
p#$sword
p#$$word
pa$sword
pa$$word
p#ssw0rd
p#$sw0rd
p#$$w0rd
pa$sw0rd
pa$$w0rd
p#sswOrd
p#$swOrd
p#$$wOrd
pa$swOrd
pa$$wOrd
Gathering all rules in one line does not help me achieve my goal, and making one rule (line) with substitution by character position is basically generating my list by hand.
I am now wondering how I can achieve my goal, or whether JtR is the right tool for the job.
I have found a solution that fits my use case. The oNx syntax replaces the character at the Nth position (zero-based) with x.
In addition, brackets make it possible to apply more than one substitution to the same character. So oN[xy] will yield two passwords, with the character at the Nth position replaced with x, then with y.
For my password example above, the rule needed to achieve my goal would be:
o1[a#] o2[sS$] o3[sS$] o5[oO0]
I hope it helps someone with some old database to unlock :)
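If you want to check what such a rule should produce, or generate the wordlist outside of JtR, the expansion is just a Cartesian product over the per-position choices. Here is a small Python sketch of that idea; the function name and the dictionary layout are my own and have nothing to do with JtR internals.

```python
from itertools import product


def expand_positions(word, choices):
    """Yield every variant of `word` with each listed position replaced
    by each of its candidate characters."""
    positions = sorted(choices)
    for combo in product(*(choices[p] for p in positions)):
        chars = list(word)
        for position, char in zip(positions, combo):
            chars[position] = char
        yield "".join(chars)


# Same choices as the rule above: position -> candidate characters.
choices = {1: "a#", 2: "sS$", 3: "sS$", 5: "oO0"}
candidates = list(expand_positions("password", choices))

print(len(candidates))  # 54
print("p#$$w0rd" in candidates)  # True
```

The count is 2 * 3 * 3 * 3 = 54, so for a 15-character passphrase with a few options per position the list stays small enough to generate in full.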
Lua script:
i=io.read()
print(i)
Command line:
echo -e "sala\x00m" | lua ll.lua
Output:
sala
I want it to print all characters from the input, similar to this:
salam
in HEX editor:
0000000: 7361 6c61 006d 0a sala.m.
How can I print all characters from the input?
You tripped over one of the few places where the Lua standard library is still not 8-bit-clean.
Specifically, file reading line-by-line is not embedded-0 proof.
The reason it isn't yet is an unfortunate combination of:
Only standard C90 or equally portable constructs are allowed for the core, which does not provide for efficient 0-clean text parsing.
Every solution discussed to date on the mailing list under that constraint has considerable overhead.
Embedded 0-bytes in text files are quite rare.
Workarounds:
Use a modified library, fixing these formats: "*l" "*L" for file:read(...)
Parse your raw data yourself (read a block using a number, or as much as possible using "*a").
Badger the Lua developers/maintainers for a bugfix until they give in.