MD5 and HMAC-MD5
This subprocedure is a generic function to apply the MD5 hash algorithm over arbitrary data. MD5 is described in RFC 1321, “The MD5 Message-Digest Algorithm”: http://www.faqs.org/rfcs/rfc1321.html.
See the follow-ups section for HMAC-MD5 code.
In summary, MD5 takes arbitrary data as input and creates a fixed-length 128-bit fingerprint (or “message digest”) of the data. In most instances, this fingerprint is represented as a 32-character hexadecimal string. This is a one-way conversion; it is presumed “computationally infeasible” to derive the original data from the fingerprint.
The MD5 algorithm is deterministic; that is, the same input will always result in the same output fingerprint.
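For readers who want to see these properties on a non-AS/400 system, here is a minimal sketch in Python. It assumes the standard `hashlib` module is available and is not part of the RPG code on this page. It shows the 16-byte (128-bit) raw digest, its 32-character hexadecimal form, and the deterministic behaviour described above.

```python
import hashlib

data = b"message digest"

# Raw fingerprint: 16 bytes, i.e. 128 bits.
digest = hashlib.md5(data).digest()

# The usual printable form: 32 hexadecimal characters.
hex_digest = hashlib.md5(data).hexdigest()

print(len(digest) * 8, "bits")
print(len(hex_digest), "hex characters:", hex_digest)

# Deterministic: hashing the same bytes again gives the same fingerprint.
same_again = hashlib.md5(data).hexdigest() == hex_digest
print("repeatable:", same_again)
```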
The following tests determine whether the MD5 algorithm is correctly implemented. (When testing, the input is all on one line with no carriage returns.)
"" (empty string, zero bytes of input) → d41d8cd98f00b204e9800998ecf8427e
a → 0cc175b9c0f1b6a831c399e269772661
abc → 900150983cd24fb0d6963f7d28e17f72
message digest → f96b697d7cb7938d525a2f31aaf161d0
abcdefghijklmnopqrstuvwxyz → c3fcd3d76192e4007dfb496cca67e13b
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789 → d174ab98d277d9f5a5611c2c9f419d9f
12345678901234567890123456789012345678901234567890123456789012345678901234567890 → 57edf4a22be3c955ac49da2e2107b67a
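The test values above can be cross-checked on another system. The following is a small sketch using Python's standard `hashlib` module; the names `RFC1321_TESTS` and `check_md5_suite` are made up for this illustration. The first vector is the empty input (zero bytes), as in RFC 1321.

```python
import hashlib

# Test suite from RFC 1321: input bytes -> expected MD5 in lowercase hex.
RFC1321_TESTS = {
    b"": "d41d8cd98f00b204e9800998ecf8427e",
    b"a": "0cc175b9c0f1b6a831c399e269772661",
    b"abc": "900150983cd24fb0d6963f7d28e17f72",
    b"message digest": "f96b697d7cb7938d525a2f31aaf161d0",
    b"abcdefghijklmnopqrstuvwxyz": "c3fcd3d76192e4007dfb496cca67e13b",
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789":
        "d174ab98d277d9f5a5611c2c9f419d9f",
    b"1234567890" * 8: "57edf4a22be3c955ac49da2e2107b67a",
}


def check_md5_suite(tests=RFC1321_TESTS):
    """Return the inputs whose computed MD5 differs from the expected value."""
    return [msg for msg, expected in tests.items()
            if hashlib.md5(msg).hexdigest() != expected]


failures = check_md5_suite()
print("all passed" if not failures else failures)
```

If the AS/400 procedure returns a different value for one of these inputs, comparing against this list narrows down whether the problem is in the hashing or in the character translation.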
I have tested this subprocedure and test program on a V4R5 system. I believe it will run on any V4 or V5 system, and probably on V3 systems with the ILE RPG compiler. If you are running a non-V4R5 system, please give this program a try and let me know if it works.
The MD5 procedure below takes two inputs:
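- A pointer to the data to be hashed. The data may be anything on the system that a single pointer can reference. (If this is not so, please let me know. Thanks!) Note that the procedure translates this data to ASCII in place before hashing, so pass a copy if you still need the original bytes afterwards.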
- Length of the data to be hashed. This can be any integer data type.
And provides the following output:
- MD5 fingerprint of the data provided. The variable receiving the MD5 fingerprint should be defined as a character field of length 32.
The function encapsulates the _CIPHER MI instruction, which has routines for a variety of encryption techniques, including MD5. It also translates the input data to its ASCII equivalent before hashing and returns the fingerprint as lowercase hexadecimal characters, so that the result can be compared directly with MD5 values produced on other systems.
d MD5 pr 32a
d inputdata * value
d inputlength 10i 0 const
p MD5 b
*-- Procedure interface
d MD5 pi 32a
d inputdata * value
d inputlength 10i 0 const
*-- Prototypes to MI procedures used herein
*-- Cipher MI procedure: interface to multiple encryption algorithms
d cipher pr extproc('_CIPHER')
d * value
d * value
d * value
*-- Convert data between code pages
d convert pr extproc('_XLATEB')
d * value
d * value
d 10u 0 value
*-- Convert hex nibbles to character equivalent
d cvthc pr extproc('cvthc')
d 1a
d 1a
d 10i 0 value
*-- Constants
d upper c 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
d lower c 'abcdefghijklmnopqrstuvwxyz'
*-- Variables for _CIPHER MI
d hashworkarea s 96a inz(*allx'00')
d receiverhex s 16a inz
d receiverptr s * inz(%addr(receiverhex))
*-- Variables for output
d receiverchr s 32a
*-- Variables for QTQCVRT API
d startmap s 256a inz
d to819 s 256a inz
d ccsid1 s 10i 0 inz(37)
d st1 s 10i 0 inz(0)
d l1 s 10i 0 inz(%size(startmap))
d ccsid2 s 10i 0 inz(819)
d st2 s 10i 0 inz(0)
d gccasn s 10i 0 inz(0)
d l2 s 10i 0 inz(%size(to819))
d l3 s 10i 0 inz
d l4 s 10i 0 inz
d fb s 12a inz
*-- Data structures
*-- Control data structure for _CIPHER MI
d controls ds
d function 5i 0 inz(5)
d hashalg 1a inz(x'00')
d sequence 1a inz(x'00')
d datalength 10i 0 inz(15)
d unused 8a inz(*loval)
d hashctxptr * inz(%addr(hashworkarea))
*-- Data structure for EBCDIC/ASCII prep
d ds
d x 5i 0
d lowx 2 2
*-------------------------------------------------------------------------
*-- Main
*-- Get all single byte ebcdic hex values
c 0 do 255 x
c eval %subst(startmap:x+1:1) = lowx
c enddo
*-- Get conversion table: ccsid 37 to 819
c call 'QTQCVRT'
c parm ccsid1
c parm st1
c parm startmap
c parm l1
c parm ccsid2
c parm st2
c parm gccasn
c parm l2
c parm to819
c parm l3
c parm l4
c parm fb
*-- Put length of data into _CIPHER control structure
c eval datalength = inputlength
*-- Change message to ccsid 819 (ascii). Do this since we will probably
*-- interact with other systems expecting ASCII data. Convert the input
*-- data to its ASCII equivalent to get an ASCII-equivalent MD5.
c callp convert(inputdata : %addr(to819) :
c inputlength)
*-- Prep hash work area for _CIPHER use
c eval hashworkarea = *allx'00'
c eval hashctxptr = %addr(hashworkarea)
*-- Get MD5 fingerprint of data
c callp(e) cipher( %addr(receiverptr) : %addr(controls)
c : %addr(inputdata) )
c if %error
c eval receiverchr = *blanks
c else
c callp cvthc( receiverchr : receiverhex :
c %size(receiverchr) )
c endif
*-- In the *NIX and PC world, letters found in MD5 are in lower case.
*-- Most everything in the AS/400 world is uppercase. Convert to lower
*-- case to make direct comparisons easier for the caller.
c UPPER:LOWER xlate receiverchr receiverchr
c return receiverchr
p MD5 e
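The effect of the translation step inside the procedure can be illustrated off the AS/400. This is only a sketch: it assumes Python's `cp037` codec is a reasonable stand-in for the CCSID 37 to 819 table that QTQCVRT returns (the real table may differ for some control characters), and `TO_819` and `md5_as_ascii` are names invented for this example.

```python
import hashlib

# 256-byte table: position i holds the Latin-1 (CCSID 819) byte for
# EBCDIC (CCSID 37) byte i. Assumption: Python's cp037 codec approximates
# the table that QTQCVRT builds in the RPG procedure.
TO_819 = bytes(range(256)).decode("cp037").encode("latin-1")


def md5_as_ascii(ebcdic_data: bytes) -> str:
    """Translate EBCDIC bytes to ASCII, then return the MD5 as lowercase hex."""
    ascii_data = ebcdic_data.translate(TO_819)  # works on a copy
    return hashlib.md5(ascii_data).hexdigest()


ebcdic_abc = "abc".encode("cp037")  # the bytes an AS/400 field would hold

print("translated first:", md5_as_ascii(ebcdic_abc))
print("raw EBCDIC bytes:", hashlib.md5(ebcdic_abc).hexdigest())
```

Only the translated version reproduces the RFC 1321 value for "abc". Unlike this sketch, the RPG procedure translates the caller's buffer in place rather than working on a copy.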
The following sample program allows you to preview MD5 encodings of various inputs from the AS/400 command line. A version could be used in a batch CL program by retrieving the program message and extracting the substitution variable.
/* Compile with PGM(MD5IC) */
cmd prompt('MD5 command line')
parm kwd(text) type(*char) len(512) prompt('Text to hash') +
vary(*yes *int4) case(*mixed)
/* CL program MD5IC: command processing program for the command above */
pgm (&text)
dcl var(&text) type(*char) len(516)
dcl var(&md5enc) type(*char) len(32)
call pgm(md5ie) parm(&text &md5enc)
sndpgmmsg msgid(cpf9897) msgf(qcpfmsg) +
msgdta('The MD5 hash is: ' *cat &md5enc)
endpgm
*-- RPG program MD5IE: called by MD5IC, returns the MD5 of the text passed in
h dftactgrp(*no) bnddir('QC2LE')
*-- Change the below to wherever you saved the MD5 prototype
/copy toolssrc/prototypes,md5
d intext ds 516
d length 10i 0 overlay(intext:1)
d data 512 overlay(intext:5)
d mymd5 s 32a
c *entry plist
c parm intext
c parm mymd5
c eval mymd5 = md5( %addr(data) : length )
c eval *inlr = *on
c return
*-- Change the below to wherever you saved the MD5 subprocedure
/copy toolssrc/procedures,md5
Follow-ups
Thanks for posting this code. I compiled it, it works. But I’m having trouble changing it so I can call it from an rpgle application. How would you change the CL? Maybe it’s the fact that inputdata is a pointer that is confusing me.
Calling MD5 from an RPGLE program should be identical to the example in MD5IE. The first parameter, the input data, is not actually the text you want to hash, but a pointer to the text. Normally, you pass the address of the data. I believe you could also use a basing pointer, but I haven’t tried it. Here is a sample from an actual program I use.
d md5data s 100a inz
d graphkey s 32a inz
c eval md5data = %trim(Program) + '|' +
c %trim(Parameters)
c eval graphkey = MD5( %addr(md5data) :
c %len(%trimr(md5data)) )
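The length argument matters as much as the pointer: the sample passes the trimmed length, so the trailing blanks of the 100-byte field are not hashed. A rough Python sketch of the same idea follows; `md5_of_field` and the sample value are hypothetical and only mimic the pointer-plus-length interface.

```python
import hashlib

FIELD_SIZE = 100  # mirrors the 100a field in the RPG sample


def md5_of_field(field: str, length: int) -> str:
    """Hash only the first `length` characters, like MD5(%addr(field) : length)."""
    return hashlib.md5(field[:length].encode("latin-1")).hexdigest()


# A fixed-length field is blank-padded on the right, as in RPG.
md5data = "PGM1|PARM1".ljust(FIELD_SIZE)
trimmed_len = len(md5data.rstrip(" "))  # same idea as %len(%trimr(md5data))

print("trimmed length:", md5_of_field(md5data, trimmed_len))
print("full field    :", md5_of_field(md5data, FIELD_SIZE))
```

The two values differ, which is a common reason for a hash that does not match the one computed on another system.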
How is HMAC-MD5 different from MD5? I’m trying to send transactions to our merchant account. They require a HMAC-MD5 hash of the data sent to them. I tried using your sample code, but our hash values do not match. So apparently I’m missing exactly how to implement my Transaction key with the MD5 hash. Do you know how I could implement the HMAC part of this in RPG?
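HMAC-MD5 is a “keyed” MD5 checksum: a secret key is mixed into the hash, so only someone who holds the key can produce or verify the digest. This helps prevent attacks on regular MD5 data. The RFC with C code and test cases can be found at http://www.faqs.org/rfcs/rfc2104.html.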
For HMAC, you are given a key K and data text.
From the RFC, the HMAC-MD5 algorithm is md5(K XOR opad, md5(K XOR ipad, text)), where the comma denotes concatenation and:
- K is an n byte key, padded with zero bytes to 64 bytes (the MD5 block size)
- ipad is the byte 0x36 repeated 64 times
- opad is the byte 0x5c repeated 64 times
- text is the data being protected
The RFC's sample code contains additional steps, such as replacing K with md5(K) when K is longer than 64 bytes. Other RFCs also describe an HMAC-MD5 protected against replay attacks.
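The construction described above can be written out step by step. The following is a minimal Python sketch built only on `hashlib.md5`. It is not the RPG implementation and not IBM's; the function name `hmac_md5` is chosen for this example. Python's standard `hmac` module is used only to cross-check the result.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # MD5 processes data in 64-byte blocks


def hmac_md5(key: bytes, text: bytes) -> str:
    """HMAC-MD5 per RFC 2104, returned as lowercase hex."""
    # Keys longer than one block are first replaced by their MD5 digest.
    if len(key) > BLOCK_SIZE:
        key = hashlib.md5(key).digest()
    # Pad the key with zero bytes up to the block size.
    key = key.ljust(BLOCK_SIZE, b"\x00")

    k_ipad = bytes(b ^ 0x36 for b in key)  # K XOR ipad
    k_opad = bytes(b ^ 0x5C for b in key)  # K XOR opad

    inner = hashlib.md5(k_ipad + text).digest()     # md5(K XOR ipad, text)
    return hashlib.md5(k_opad + inner).hexdigest()  # md5(K XOR opad, inner)


manual = hmac_md5(b"Jefe", b"what do ya want for nothing?")
library = hmac.new(b"Jefe", b"what do ya want for nothing?", hashlib.md5).hexdigest()
print(manual)
print("matches hmac module:", manual == library)
```

Note that the inner hash is fed to the outer hash as 16 raw bytes, not as its 32-character hex form. Mixing the two up is an easy way to get mismatched values.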
The CIPHER MI function (http://publib.boulder.ibm.com/iseries/v5r1/ic2924/tstudio/tech_ref/mi/CIPHER.htm) supports MD5 and HMAC-MD5 (in addition to the SHA-1 algorithm, among others). The existing MD5 subprocedure is a template for creating an additional HMAC-MD5 subprocedure.
The RFC provides the following test suite for HMAC-MD5. When testing, all input is on one line without carriage returns. The trailing null of a character string is not part of the input, so the lengths shown do not include it.
Key (16 bytes): 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b
Data (8 bytes): "Hi There"
Digest: 9294727a3638bb1c13f48ef8158bfc9d
Key (4 bytes): "Jefe"
Data (28 bytes): "what do ya want for nothing?"
Digest: 750c783e6ab0b503eaa86e310a5db738
Key (16 bytes): 0xAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Data (50 bytes): 0xDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
Digest: 56be34521d144c88dbb8c733f0e8b3f6
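To confirm the expected digests independently of the AS/400, the three test cases can be run through Python's standard `hmac` module. This is a sketch for cross-checking only; `VECTORS` and `run_vectors` are names made up for this example.

```python
import hashlib
import hmac

# (key, data, expected digest) for the three test cases listed above.
VECTORS = [
    (bytes.fromhex("0b" * 16), b"Hi There",
     "9294727a3638bb1c13f48ef8158bfc9d"),
    (b"Jefe", b"what do ya want for nothing?",
     "750c783e6ab0b503eaa86e310a5db738"),
    (bytes.fromhex("aa" * 16), bytes.fromhex("dd" * 50),
     "56be34521d144c88dbb8c733f0e8b3f6"),
]


def run_vectors(vectors=VECTORS):
    """Return one True/False per test case: does the computed HMAC match?"""
    return [hmac.new(key, data, hashlib.md5).hexdigest() == expected
            for key, data, expected in vectors]


results = run_vectors()
for number, ok in enumerate(results, start=1):
    print("test case", number, "passed" if ok else "FAILED")
```

The keys and data in cases 1 and 3 are raw bytes, not text, so they are built from hex here rather than typed as characters.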
This code is BETA! It does not work correctly for all test cases! Do not use it in a production environment!
I have tried this code on V4R5. It should be upwards compatible through V5R2. As usual, my source library is TOOLSSRC, and the program library is TOOLS. Change to suit your needs.
*-------------------------------------------------------------------------
*-- MD5 HMAC hash
*-- This prototype will return an MD5 HMAC hash (see RFC 2104).
*-- To my knowledge, any data that can be passed by pointer (and whose
*-- length fits into a 10i 0 numeric field) can be hashed. This can
*-- include:
*-- > character data (though I would not recommend varying length)
*-- > data structures
*-- > numeric data (data type zoned recommended)
*-- > file record formats
*-- > graphic character data
*-- > objects stored on the IFS (you would probably need to read the
*-- object into an allocated memory space to use correctly)
*-- > anything else that can be addressed by pointer
*--
*-- NOTE: The input is translated to ASCII (CCSID 819) before hashing, so
*-- the hash matches the one computed for the same text on ASCII
*-- systems. This is to facilitate direct comparison between MD5
*-- hashes from other systems.
*-- NOTE: Requires the following compilation options:
*-- DFTACTGRP(*NO) BNDDIR('QC2LE')
*--
*-- ATTRIBUTIONS
*-- The original program on which this source is based can be found at
*-- the website http://www.as400pro.com/TipsRPG7.htm
*--
*-- INPUTS
*-- 1) pointer to data to be hashed (data type pointer)
*-- 2) length of data to be hashed (data type 10i 0)
*-- 3) pointer to HMAC secret key (data type pointer)
*-- 4) length of HMAC secret key (data type 10i 0)
*--
*-- OUTPUT
*-- 1) MD5 HMAC hash fingerprint of input data (data type char 32)
*--
*-- EXAMPLE OF USE
*-- eval datatohash = 'my data to hash'
*-- eval hashkey = 'my key'
*-- eval datahashmd5 = md5hmac( %addr(datatohash) :
*-- %len(%trimr(datatohash)) :
*-- %addr(hashkey) : %len(%trimr(hashkey)) )
*-------------------------------------------------------------------------
d MD5HMAC pr 32a
d inputdata * value
d inputlength 10i 0 const
d inhmacdata * value
d inhmaclength 10i 0 const
*-------------------------------------------------------------------------
*-- MD5 HMAC hash - procedure
*-------------------------------------------------------------------------
p MD5HMAC b
d MD5HMAC pi 32a
d inputdata * value
d inputlength 10i 0 const
d inhmacdata * value
d inhmaclength 10i 0 const
d cipher pr extproc('_CIPHER')
d * value
d * value
d * value
d convert pr extproc('_XLATEB')
d * value
d * value
d 10u 0 value
d cvthc pr extproc('cvthc')
d 1a
d 1a
d 10i 0 value
d controls ds
d function 5i 0 inz(5)
d hashalg 1a inz(x'00')
d sequence 1a inz(x'00')
d datalength 10i 0 inz(15)
d output 1a inz(x'01')
d unused 7a inz(*loval)
d hashctxptr * inz(%addr(hashworkarea))
d hmacctxptr *
d hmaclength 10i 0 inz(15)
d hashworkarea s 160a inz(*allx'00')
d receiverhex s 16a inz
d receiverptr s * inz(%addr(receiverhex))
d receiverchr s 32a
d startmap s 256a inz
d to819 s 256a inz
d ccsid1 s 10i 0 inz(37)
d st1 s 10i 0 inz(0)
d l1 s 10i 0 inz(%size(startmap))
d ccsid2 s 10i 0 inz(819)
d st2 s 10i 0 inz(0)
d gccasn s 10i 0 inz(0)
d l2 s 10i 0 inz(%size(to819))
d l3 s 10i 0 inz
d l4 s 10i 0 inz
d fb s 12a inz
d ds
d x 5i 0
d lowx 2 2
d upper c 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
d lower c 'abcdefghijklmnopqrstuvwxyz'
* Get all single byte ebcdic hex values
c 0 do 255 x
c eval %subst(startmap:x+1:1) = lowx
c enddo
* Get conversion table: ccsid 37 to 819
c call 'QTQCVRT'
c parm ccsid1
c parm st1
c parm startmap
c parm l1
c parm ccsid2
c parm st2
c parm gccasn
c parm l2
c parm to819
c parm l3
c parm l4
c parm fb
c eval datalength = inputlength
c eval hmaclength = inhmaclength
* Change message to ccsid 819 (ascii)
c callp convert(inputdata : %addr(to819) :
c inputlength)
c callp convert(inhmacdata : %addr(to819) :
c inhmaclength)
* Get MD5 HMAC
c eval hashworkarea = *allx'00'
c eval hashctxptr = %addr(hashworkarea)
c eval hmacctxptr = inhmacdata
c callp(e) cipher( %addr(receiverptr) : %addr(controls)
c : %addr(inputdata) )
c if %error
c eval receiverchr = *blanks
c else
* Convert nibbles to characters
c callp cvthc( receiverchr : receiverhex :
c %size(receiverchr) )
c endif
c UPPER:LOWER xlate receiverchr receiverchr
c return receiverchr
p MD5HMAC e
Example use with command MD5H
cmd prompt('MD5 HMAC command line')
parm kwd(text) type(*char) len(512) prompt('Text to hash') vary(*yes *int4) case(*mixed)
parm kwd(key) type(*char) len(512) prompt('HMAC key to use') vary(*yes *int4) case(*mixed)
/* CL command processing program for the MD5H command above */
pgm (&text &key)
dcl var(&text) type(*char) len(516)
dcl var(&key) type(*char) len(516)
dcl var(&md5enc) type(*char) len(32)
call pgm(md5hie) parm(&text &key &md5enc)
sndpgmmsg msgid(cpf9897) msgf(qcpfmsg) msgdta('The MD5 hash is: ' *cat &md5enc)
dmpclpgm
endpgm
*-- RPG program MD5HIE: called by the CL program above
h dftactgrp(*no) bnddir('QC2LE')
/copy toolssrc/prototypes,md5hmac
d intext ds 516
d length 10i 0 overlay(intext:1)
d data 512 overlay(intext:5)
d inkey ds 516
d klength 10i 0 overlay(inkey:1)
d kdata 512 overlay(inkey:5)
d mymd5 s 32a
c *entry plist
c parm intext
c parm inkey
c parm mymd5
c eval mymd5 = md5hmac( %addr(data) : length :
c %addr(kdata) : klength )
c eval *inlr = *on
c return
/copy toolssrc/procedures,md5hmac
This code works correctly for test case 1 (data “Hi There”), but not for test cases 2 and 3. According to the IBM documentation on the CIPHER instruction, the minimum key size is 16 bytes, so test case 2 fails as provided because its 4-byte key is too short. The RFC itself has no mandatory key length; it says only that “the minimal recommended length for K is L bytes (as the hash output length)”. Since MD5 output is 16 bytes (commonly displayed as 32 hexadecimal characters), the recommended key length is 16 bytes, and IBM has made this recommendation a requirement. I am unsure whether or how IBM’s implementation of HMAC-MD5 in CIPHER passed the test suite.
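The key-length rule does not account for test case 3, whose key is 16 bytes. One possible explanation, offered as a hypothesis and not confirmed on an actual system, is that the procedure translates both the key and the data from CCSID 37 to 819 before hashing. That is right for text, but it changes raw binary values such as 0xAA and 0xDD. The Python sketch below imitates that step. It assumes the `cp037` codec approximates the QTQCVRT table, and the helper names are invented for this example.

```python
import hashlib
import hmac

# Assumption: Python's cp037 codec stands in for the CCSID 37 -> 819 table.
TO_819 = bytes(range(256)).decode("cp037").encode("latin-1")


def hmac_after_translation(key: bytes, data: bytes) -> str:
    """Translate key and data as the RPG procedure does, then HMAC-MD5 them."""
    return hmac.new(key.translate(TO_819), data.translate(TO_819),
                    hashlib.md5).hexdigest()


# Test case 1: the key byte 0x0B survives translation, and the text is EBCDIC.
case1 = hmac_after_translation(bytes.fromhex("0b" * 16),
                               "Hi There".encode("cp037"))

# Test case 3: binary key and data are altered by the translation.
key3 = bytes.fromhex("aa" * 16)
data3 = bytes.fromhex("dd" * 50)
case3 = hmac_after_translation(key3, data3)

print("case 1 matches RFC:", case1 == "9294727a3638bb1c13f48ef8158bfc9d")
print("case 3 matches RFC:", case3 == "56be34521d144c88dbb8c733f0e8b3f6")
print("key 3 after translation:", key3.translate(TO_819).hex())
```

If this is indeed the cause, a variant of the procedure that skips the translation for binary keys and data would be the thing to try for test case 3.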