
MD5 Cryptographic Hash Function

November 2 2009 at 1:52 PM
Laanan Fisher  (no login)

I posted this QuickBASIC/QBasic MD5 implementation a few days ago at forum.qbasicnews.com. I wanted to post it here also, but was afraid the forum formatting would make the code unreadable. If it's OK, here's a link to the original thread:

http://forum.qbasicnews.com/index.php?topic=13371.0

 
qbguy
(no login)

Artelius wrote one too

November 2 2009, 2:59 PM 

http://www.network54.com/Forum/178387/message/1231538625/MD5+function

It has a nice hack to bypass integer overflow.
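For readers who have not met the problem: QBasic's LONG is a signed 32-bit type and raises an overflow error instead of wrapping, while MD5 needs addition modulo 2^32. The sketch below shows one common workaround, adding the 16-bit halves separately so that no intermediate value leaves the signed 32-bit range. It is written in Python as a stand-in and is not necessarily the hack used in the linked code; `check` and `add32` are hypothetical names.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def check(v):
    # Stand-in for the overflow error QBasic raises on LONG arithmetic.
    if not INT32_MIN <= v <= INT32_MAX:
        raise OverflowError(v)
    return v

def add32(a, b):
    """Add two signed 32-bit values modulo 2**32 without ever overflowing."""
    # Low halves: the sum is at most 0x1FFFE, far below the LONG limit.
    lo = check((a & 0xFFFF) + (b & 0xFFFF))
    # High halves plus the carry out of the low sum.
    hi = check(((a >> 16) & 0xFFFF) + ((b >> 16) & 0xFFFF) + (lo >> 16))
    hi &= 0xFFFF            # discard the carry out of bit 31
    lo &= 0xFFFF
    if hi >= 0x8000:        # reinterpret the high half as signed
        hi -= 0x10000
    return check(hi * 65536 + lo)

print(add32(2147483647, 1))   # wraps around instead of raising
```

Note that Python's `>>` is an arithmetic shift; a QBasic port would have to handle negative values explicitly, because `\` truncates toward zero.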

Here is the RFC for MD5 in case you want to compete for the title of fastest QBASIC MD5 function:
http://www.faqs.org/rfcs/rfc1321.html

 
Laanan Fisher
(no login)

Re: Artelius wrote one too

November 3 2009, 5:59 AM 

Thanks for the link, qbguy! There are some nice time- and space-saving ideas in there. I don't think I could compete on speed, but perhaps incremental hashing could be adapted.
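To illustrate what incremental hashing means here: the message is fed to the hash in pieces, and the digest is the same as if it had been hashed in one go. The sketch uses Python's standard `hashlib` purely as a reference implementation, not either of the QBasic programs in this thread; `md5_incremental` is a hypothetical helper name.

```python
import hashlib

def md5_incremental(chunks):
    """Hash a sequence of byte chunks without joining them first."""
    h = hashlib.md5()
    for chunk in chunks:
        h.update(chunk)     # state carries over between calls
    return h.hexdigest()

one_shot = hashlib.md5(b"abc").hexdigest()
streamed = md5_incremental([b"a", b"b", b"c"])
print(one_shot == streamed)
```

The digest of "abc" is one of the test vectors listed in RFC 1321, which makes it a convenient check for any implementation.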

 

(Login Mikrondel)

And unlike yours it works in QB7.1 too :P

November 3 2009, 3:17 PM 

Using VARPTR without VARSEG is dangerous. And using either of them on strings is not portable to QB7.1, in which you need to use SADD and SSEG instead.

Pointers are ugly in DOS, and QBasic isn't a pointer-friendly language. Avoid them if you can.

*Ceteris paribus*, the "unrolled" implementation that you used might actually be faster. But I think my speed hacks make a huge difference.

If you want me to explain any magic in my code, just say the word...

 
Laanan Fisher
(no login)

Re: And unlike yours it works in QB7.1 too :P

November 3 2009, 6:40 PM 

Yes, I tried to minimize the use of variable-length strings, and to limit high-level modification of them to "in-place" operations only (no reallocations). As you know, that may not be enough to prevent QB from invalidating pointers, and of course there's nothing to stop a user from making extensive use of variable-length strings. I don't have PDS 7.1, so I can't really design for it.

My original rotation procedure used a loop similar to yours, but contained many ^'s. I spent a little time trying to get the addition to behave like yours as well, but after a short time I gave up. Now that I see a working implementation it does make sense. This is probably my biggest bottleneck, and I can see huge advantages to at least inlining the round operations and folding ^ constant operands where possible. Thanks for your comments.
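A rough sketch of the constant-folding idea: MD5's rotation amounts are fixed per step, so the powers of two can be looked up (or written as literals) instead of being recomputed with an exponentiation on every call. This is Python working on unsigned values, so it sidesteps the signed-LONG handling a QBasic version needs; `POW2`, `rotl_slow` and `rotl_fast` are hypothetical names, and neither function is taken from the programs discussed here.

```python
MASK32 = 0xFFFFFFFF
POW2 = [1 << n for n in range(33)]   # computed once, then only indexed

def rotl_slow(x, n):
    # Recomputes both powers on every call, like using ^ inside the loop.
    return ((x * 2 ** n) & MASK32) | (x // 2 ** (32 - n))

def rotl_fast(x, n):
    # Same result using table lookups; with a literal n the lookups
    # could be replaced by constants altogether.
    return ((x * POW2[n]) & MASK32) | (x // POW2[32 - n])

print(hex(rotl_fast(0x12345678, 8)))
```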

 

(Login Mikrondel)

Invalidating pointers isn't the issue

November 4 2009, 12:50 AM 

The issue is that using VARPTR() on a string relies on undocumented behaviour, and that behaviour changed between QB4.5 and QB7.1.

In QB4.5, VARPTR() gets you the address of the actual string data whereas in QB7.1 it gets the address of a string descriptor (within which can be found a pointer to the actual string data). This change was done to enable strings to live in far memory, thus increasing the amount of memory available to the program.

This affects both fixed-length and variable-length strings.

Even if you take this into account (by using SADD instead of VARPTR), the strings CAN be in far memory which means you need to mind both their segment and offset when working with them. Although I have a feeling that locally-declared fixed-length strings are still allocated on the stack and hence will be in DGROUP.
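As a reminder of why the segment matters: in real-mode DOS a far address is a segment:offset pair, and the physical address is the segment times 16 plus the offset. A minimal Python sketch of that arithmetic follows; `linear_address` is a hypothetical helper, and the 20-bit wraparound assumes classic 8086 behaviour.

```python
def linear_address(segment, offset):
    """Combine a real-mode segment:offset pair into a 20-bit address."""
    return ((segment << 4) + offset) & 0xFFFFF

# Different pairs can name the same byte, so an offset on its own is
# meaningless unless you know which segment it belongs to.
print(hex(linear_address(0xB800, 0x0000)))
print(linear_address(0x1234, 0x0010) == linear_address(0x1235, 0x0000))
```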

 
Anonymous
(no login)

Re: Invalidating pointers isn't the issue

November 4 2009, 6:00 AM 

Well, the code was designed to run in QBasic 1.1, though I still tested it with QuickBASIC 4.5 (interpreted and compiled), both of which seem to return the descriptor of variable-length strings using VarPtr (including string parameters of course, see MD5GetStringDataPtr%). As far as I can tell any non-static fixed-length string character data is indeed stored on the stack. It's my understanding that both will also automatically reorganize variable-length string character data as memory fragmentation increases over time, though I do not know specific details.

From Ethan Winer's BASIC Techniques and Utilities:
"In QuickBASIC programs and BASIC PDS when far strings are not specified, all strings are stored in an area of memory called the *near heap*. The string data in this memory area is frequently shuffled around, as new strings are assigned and old ones are abandoned."

I do not know the breadth of the behavior that is undocumented in QBasic/QuickBASIC, but their documentation does leave a bit to be desired in any case. As I don't have PDS 7.1, I can't, and didn't/don't, guarantee anything about its stability on that platform (just as I could not guarantee its stability in Python or Haskell; I doubt it would even compile. ;D), but one may feel free to modify the code to support that and other systems as they see fit.

 

(Login Mikrondel)

Whoops

November 6 2009, 9:54 PM 

My memory failed me for a bit there. Yes, all versions of QuickBasic and QBasic use string descriptors for variable-length strings, and fixed-length strings have no descriptors.

The "undocumented behaviour" is the contents of these descriptors. In QBasic, QB 4.5, and QB 7.1 compiled with near strings, these are all the same.

In QB 7.1, compiled with far strings, OR interpreted, the format of the string descriptors is different. The gory details are in Winer Chapter 2.
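For the near-string case, here is a small model of what following a descriptor looks like. It assumes the four-byte layout as Winer describes it (a two-byte length followed by a two-byte offset into DGROUP); that layout is an assumption of this sketch, and the far-string format is not modelled. `read_near_string` and the toy `dgroup` buffer are hypothetical.

```python
import struct

def read_near_string(dgroup, descriptor_offset):
    """Follow an assumed 4-byte near-string descriptor to its data."""
    # Assumed layout: little-endian length word, then offset word.
    length, data_offset = struct.unpack_from("<HH", dgroup, descriptor_offset)
    return bytes(dgroup[data_offset:data_offset + length])

# Toy data segment: descriptor at offset 0, character data at offset 32.
dgroup = bytearray(64)
dgroup[32:37] = b"HELLO"
struct.pack_into("<HH", dgroup, 0, 5, 32)

print(read_near_string(dgroup, 0))
```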

For the record, I tried compiling your program in 7.1 with near strings. Didn't crash.

My preferred solution is to steer clear of strings, generally by using arrays. You *can* portably use fixed-length strings if you wrap them up in a TYPE (as suggested by Winer in Chapter 12) or are careful to ONLY pass their address, never the actual string, as an argument to a procedure.

>It's my understanding that both will also automatically reorganize variable-length string character data as memory fragmentation increases over time, though I do not know specific details.

Only if you do anything that requires the creation of a string (that is, string assignment, concatenation, or using functions that return strings like CHR$() or MID$() or MD5IReturnAString$; ASC() and LEN() are fine). If none of these things happens in between getting the address of string data and using the address, then there's nothing to worry about.
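A toy model of the hazard described above: an address taken before a string is created may point at the wrong bytes afterwards, because the heap has been compacted. This Python sketch compacts on every assignment to keep the effect visible, which is a simplification and not how QuickBASIC schedules its string-space cleanup; the `NearHeap` class and its methods are hypothetical.

```python
class NearHeap:
    """Toy string heap that compacts itself whenever a string is assigned."""

    def __init__(self):
        self.data = bytearray()
        self.strings = {}   # name -> (offset, length), a stand-in for descriptors

    def assign(self, name, value):
        self.strings.pop(name, None)    # abandon the old contents
        self._compact()                 # surviving strings slide down
        self.strings[name] = (len(self.data), len(value))
        self.data += value

    def _compact(self):
        packed = bytearray()
        for name, (off, length) in list(self.strings.items()):
            self.strings[name] = (len(packed), length)
            packed += self.data[off:off + length]
        self.data = packed

    def address_of(self, name):
        return self.strings[name][0]

    def peek(self, offset, length):
        return bytes(self.data[offset:offset + length])

heap = NearHeap()
heap.assign("a$", b"first")
heap.assign("b$", b"second")
stale = heap.address_of("b$")           # address taken here...
heap.assign("a$", b"x")                 # ...then a string is created
print(heap.peek(stale, 6))              # no longer the bytes of b$
print(heap.peek(heap.address_of("b$"), 6))
```

Re-reading the address after any string-creating statement, as the last line does, is the safe pattern in this model.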

 
Laanan Fisher
(no login)

Re: Whoops

November 6 2009, 11:03 PM 

>If none of these things happens in between getting the address of string data and using the address, then there's nothing to worry about.

Yes, this is what I assumed would be the case. Though, in my experimentation, seemingly innocuous statements like "print 420" may also cause the garbage collector to run.

>In QB 7.1, compiled with far strings, OR interpreted, the format of the string descriptors is different.

I guess my best recourse, then, is simply to state which tool (and which compiler/interpreter settings) the code is designed for.

>For the record, I tried compiling your program in 7.1 with near strings. Didn't crash.

Well, that's half the battle right there! I wonder if it output correct digests? Thanks for trying it out, anyway.

 

(Login Mikrondel)

Haha, an implicit STR$(), gotta watch out for that one...

November 6 2009, 11:45 PM 


 