Better getopt_long.


I do not like getopt_long. It splits short options and long options into two separate function parameters, yet the short value still has to be put into struct option. That is a waste of memory. The getopt* function family also has the problem of using global variables to return information (optarg, optind, opterr, optopt). This wastes memory as well: most of the time a program processes its arguments once and then moves on, and the global variables are left hanging in memory unused. Now, it may be that since the getopt* family is part of glibc, these globals are loaded by default anyway, although this could probably be fixed by compiling glibc differently. If they are loaded anyway, that is at least one reason to use getopt.
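To make the complaint concrete, here is a minimal sketch of ordinary getopt_long usage. The option names (--verbose, --name) and the function parseDemo are hypothetical and exist only for illustration. Note how the short letters appear twice, once in the option string and once in the val member of struct option, and how the argument comes back through the global optarg.

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical demo: parses --verbose/-v and --name/-n <arg>.
 * Returns 0 on success, -1 on unknown option or missing argument. */
int parseDemo(int argc,char **argv,int *verbose,const char **name){
	/* The short letters are listed here ... */
	static const char shortopts[]="vn:";
	/* ... and again in the val member (last column) here. */
	static const struct option longopts[]={
		{"verbose",no_argument,NULL,'v'},
		{"name",required_argument,NULL,'n'},
		{NULL,0,NULL,0}
	};
	int c;
	optind=1; /* global: index of the next argv element to scan */
	opterr=0; /* global: silence getopt's own error messages */
	while((c=getopt_long(argc,argv,shortopts,longopts,NULL))!=-1){
		switch(c){
			case 'v': *verbose=1; break;
			case 'n': *name=optarg; break; /* global: the option's argument */
			default: return -1;
		}
	}
	return 0;
}
```

A caller would pass argc and argv straight from main, plus pointers to the two result variables.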

Of course, you shouldn't just go and write your own code because you don't like something. Someone else probably dislikes getopt_long as well and has already made an alternative. So before I decided to write my own code, I tried to find one. I remember not finding anything suitable, since I ended up writing my own. I don't remember what my reasoning was at the time, but I can redo the search now and explain why I don't like what I find.

  • optlist: There is no need to make a linked list. The library only handles short options.
  • optparse: The init function is not needed.
  • argtable3: Interesting ideas about more general CLI handling. The structures are unoptimized. The constructor functions allocate memory.
  • commander: Has an init function which I don't see the point of.
  • yuck: The yuck file does solve documentation problems, but I would rather have C code.
  • dropt: Is there a need for a context? An error handler, really?
  • xopt: Do we need context creation? Why not simply create the options (array, hash, or tree) and run a function? The type idea has some merit.
  • saneopt: Memory allocations.
  • line-arg: Memory allocations.
  • get_options: The documentation is not informative enough for me to figure out what it does.

So I decided to write my own. I called it OpHand and designed it for long options. I probably took xopt's type idea (or something similar) and used it to cut down on coding. The problem with this is that I can't use a switch for fast handling. Hash mapping could make finding the correct option faster, but there is the problem of initialization. The code is provided below.

Options are stored in the Option structure. The structure is first optimized by placing the pointers first, because modern computers use 64-bit addressing and this avoids some padding. After that, the members go by size to avoid more padding. The member longoption is, as the name suggests, the long version of an option, and option is the short version. The member variable is either a pointer to the variable changed by the option, a callback function, or a string to be printed. The member value is used if an argument is not needed. The member flags is a bit-flag field for the option.

typedef struct Option{
	char *longoption;
	union{
		int32_t *p32;
		char **str;
		OptFunction func;
		const char *printstr;
	}variable;
	union{
		int32_t *p32;
		void *coderdata;
		int32_t v32;
		char *str;
	}value;
	char option;
	OptionFlag flags;
}Option;
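The padding argument can be checked with sizeof. The two structures below are hypothetical and not part of OpHand. They hold the same members in a different order, and the sizes in the comment assume a typical 64-bit ABI with 8-byte pointers, so other platforms may give different numbers.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical layouts for comparison only. */
struct Interleaved{
	char option;      /* 1 byte, then padding up to pointer alignment */
	char *longoption;
	char flags;       /* 1 byte, then padding again */
	void *variable;
};
struct PointersFirst{
	char *longoption;
	void *variable;
	char option;      /* the two small members share one padded tail */
	char flags;
};

/* Prints both sizes; with 8-byte pointers this is typically 32 and 24. */
void printLayoutSizes(void){
	printf("interleaved: %zu, pointers first: %zu\n",
	       sizeof(struct Interleaved),sizeof(struct PointersFirst));
}
```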

The OptionFlag bit field has booleans for whether the option needs an argument and whether processing should stop when the option is hit, plus a type for the operation. The type takes the rest of the bits so that the total reaches the nearest power of 2 (1 + 1 + 6 = 8 bits). The types are:

  • OPHAND_CMD_PRINT prints .value.v32 characters from the string pointed to by .variable.printstr.
  • OPHAND_CMD_VALUE sets the integer pointed to by .variable.p32 either to the argument or to .value.v32.
  • OPHAND_CMD_POINTER_VALUE sets the pointer at .variable.str to the argument or to .value.str. The coder can use .value.p32 to set the pointer to a 32-bit integer.
  • OPHAND_CMD_FUNCTION calls the callback function .variable.func, to which the coder can supply data through .value.coderdata.
  • OPHAND_CMD_OR and OPHAND_CMD_AND operate on the 32-bit field pointed to by .variable.p32 with the value given in .value.v32. They do not currently have an argument version.

The callback has the parameters char option, void *restrict coderdata, const char *restrict arg. The first is the short option. The second is the coder's data, and the last is the argument from the command line, which is null if the option doesn't take an argument. The callback must return true if opHand processing should continue and false to indicate an error. Internally, two functions, switchNonArgument and switchArgument, make the decision on type. As the names suggest, argument options and non-argument options are handled in separate switches. Separate functions are used for better flow in opHand.

typedef struct OptionFlag{
	bool argument : 1;
	bool stop : 1;
	enum{
		OPHAND_CMD_VALUE,
		OPHAND_CMD_POINTER_VALUE,
		OPHAND_CMD_FUNCTION,
		OPHAND_CMD_OR,
		OPHAND_CMD_AND,
		OPHAND_CMD_PRINT,
	}type : 6;
}OptionFlag;
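As an illustration of the callback contract, here is a minimal sketch of a callback that reads a floating-point argument. The name parseDoubleCallback is hypothetical, and the typedef is repeated only so that the snippet stands alone. Returning false makes opHand stop, which it currently reports as OPHAND_PROCESSING_STOPPED.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <errno.h>

/* Same signature as OptFunction in OpHand.h. */
typedef bool (*OptFunction)(char option,void *restrict coderdata,const char *restrict arg);

/* Hypothetical callback: stores the argument as a double in the
 * variable that coderdata points to. Returns false when the argument
 * is missing, is not a complete number, or is out of range. */
bool parseDoubleCallback(char option,void *restrict coderdata,const char *restrict arg){
	(void)option; /* the short option letter is not needed here */
	if(arg==NULL) return false;
	char *end;
	errno=0;
	double value=strtod(arg,&end);
	if(end==arg || *end!='\0' || errno==ERANGE) return false;
	*(double*)coderdata=value;
	return true;
}
```

An Option entry using it could look like {"ratio",.variable.func=parseDoubleCallback,.value.coderdata=&ratio,'r',{true,false,OPHAND_CMD_FUNCTION}}, where ratio is a double owned by the caller.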

I do like the flow I made. Small features could still be added, but I like the overall structure. The first decision is between two dashes and one. If option characters follow the dashes, a search for the matching Option is started. The search is a for loop. If no Option is found, the search loop exits "naturally" and an unknown-option error is returned. If an Option is found, execution either returns (with an error such as a missing argument, or because processing was stopped) or uses goto to "continue" the outer args loop.

args is reused to store the non-options for the user to process. This could instead be done through a separate parameter. The C standard only states that the strings argv[i] point to are modifiable. However, it would be completely crazy if the argv pointers themselves couldn't be modified. The OS can't trust that a program wouldn't intentionally leak memory, so it cannot depend on those pointers staying as they were. Also, the OS has to copy the strings into the process's memory for them to be modifiable, so their addresses differ from what they were at the exec* call. Hence, a new array of pointers is needed anyway.
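The reuse of args can be sketched in isolation. The helper below is hypothetical and much reduced from what opHand does: it only moves non-options to the front of the array and null-terminates the result, without handling any options.

```c
#include <stddef.h>

/* Hypothetical helper: moves every argument that does not start with
 * '-' to the front of args, null-terminates the result and returns
 * the number of arguments kept. args must have room for argn+1
 * pointers, as argv does. Null entries are skipped. */
int compactNonOptions(int argn,char **args){
	int kept=0;
	for(int i=0;i<argn;i++){
		if(args[i]==NULL) continue;   /* guard against null entries */
		if(args[i][0]=='-') continue; /* option: a real parser would handle it here */
		args[kept++]=args[i];         /* kept<=i, so nothing unread is overwritten */
	}
	args[kept]=NULL;
	return kept;
}
```

Because the write index never passes the read index, the compaction needs no second array.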

An element of args can technically be a null pointer under the UNIX standard (the exec* family has no clear mandate to check for it), so a guard against such input should be in place. The default behaviour I decided on is to skip over such elements.

Example code is below. opHand returns either that processing is done, that processing was stopped, or one of two errors.

#include<stdint.h>
#include<unistd.h>
#include<stdio.h>
#include"OpHand.h"

bool sillyCallback(char option,void *restrict coderdata,const char *restrict arg){
	char *text=(char *)coderdata;
	printf("Silly: %s %s\n",text,arg);
	return true;
}

int main(int argn,char *args[]){
	
	const char usage[]="--help,-h\t Show usage.\n"
	                   "--value,-k\t Grab a value.\n"
	                   "--greeting,-g\t Grab a string.\n"
	                   "--flagand,-a\t Do and flag.\n"
	                   "--flagor,-o\t Do or flag.\n"
	                   "--silly,-s\t Silly printing with argument.\n\n";
	
	int32_t clinumber=0;
	char *greeting=NULL;
	typedef union Flags{
		struct{
			bool and:1;
			bool or:1;
			bool padding:1;
		}f;
		int32_t shadow;
	}Flags;
	const Flags AndFlag={.f.and=1,.f.or=1,.f.padding=1};
	const Flags OrFlag={.f.and=1,.f.or=1,.f.padding=0};
	Flags flags={.f.and=1,.f.or=0,.f.padding=0};
	Option options[]={{"help",.variable.printstr=usage,.value.v32=sizeof(usage)-1,'h',{false,true,OPHAND_CMD_PRINT}}
	                 ,{"value",.variable.p32=&clinumber,.value.v32=0,'k',{true,false,OPHAND_CMD_VALUE}}
	                 ,{"greeting",.variable.str=&greeting,.value.v32=0,'g',{true,false,OPHAND_CMD_POINTER_VALUE}}
	                 ,{"flagand",.variable.p32=&flags.shadow,.value.v32=AndFlag.shadow,'a',{false,false,OPHAND_CMD_AND}}
	                 ,{"flagor",.variable.p32=&flags.shadow,.value.v32=OrFlag.shadow,'o',{false,false,OPHAND_CMD_OR}}
	                 ,{"silly",.variable.func=sillyCallback,.value.coderdata=(void*)"This is silly.",'s',{true,false,OPHAND_CMD_FUNCTION}}
	};
	
	switch(opHand(argn-1,args+1,options,sizeof(options)/sizeof(Option))){
		case OPHAND_PROCESSING_STOPPED:
			(void)write(STDOUT_FILENO,"Early stop!\n",12);
			/* fall through */
		case OPHAND_PROCESSING_DONE:
			break;
		case OPHAND_UNKNOW_OPTION:
			(void)write(STDERR_FILENO,"Unknown option!\n",16);
			return 1;
		case OPHAND_NO_ARGUMENT:
			(void)write(STDERR_FILENO,"No argument!\n",13);
			return 1;
	}
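	// greeting stays NULL if --greeting was not given, and passing NULL to %s is undefined.
	printf("Doing stuff!\nclinumber: %d\ngreeting: %s\nflags: %X\n",(int)clinumber,greeting?greeting:"(none)",(unsigned)flags.shadow);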
	for(char **ite=args;*ite;ite++) puts(*ite);
	
	return 0;
}      

I could add features to the module:

  • Command for getsubopt processing.
  • Command for floating-point numbers.
  • After --, arguments that look like options should be put in the args list (Guideline 10).
  • Condensable short options (Guideline 14).
  • Optional arguments.
  • Equal sign to mark the argument in long options.
  • Better handling of callback errors, as they are currently reported as a processing stop.
  • Hash lookup, since it would be faster than repeated for-loop searching. The problem is that, to be efficient, the hash map must be in read-only memory. Run-time creation would take too long considering that the user most likely gives only a few options per call. This would need macro usage at its limit or a compiler that generates C.
  • ophand_short.
  • Git-style command handling function which allows options like --help.
  • Automatic help and version handling.
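As a rough idea of what the equal-sign item could involve, here is a minimal sketch. The helper splitLongOption is hypothetical and is not part of OpHand. It only finds where the option name ends and where the value starts, and leaves the input string untouched.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: text is the argument without its leading "--".
 * Stores the length of the option name in *namelen and returns a
 * pointer to the value after '=', or NULL if there is no '='. */
const char *splitLongOption(const char *text,size_t *namelen){
	const char *equal=strchr(text,'=');
	if(equal==NULL){
		*namelen=strlen(text);
		return NULL;
	}
	*namelen=(size_t)(equal-text);
	return equal+1;
}
```

The option search would then have to compare names with strncmp plus a length check instead of a plain strcmp, because the name is no longer null-terminated.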

BTW: I didn't check my grammar on comments.

Full Module

OpHand.h

/********************************************************************
* This module is general functions for CLI handling.                *
********************************************************************/
#ifndef _OP_HAND_H_
#define _OP_HAND_H_

#include<stdint.h>
#include<stdbool.h>

/*********************************************************
* Macro for checking whether the given option flag asks  *
* for an argument.                                       *
*********************************************************/
#define HAS_ARGUMENT(A) ((A).argument)

/*********************************************************
* Return value of the ophand.                            *
* Designed so that zero or one is returned on SUCCESS.   *
* OPHAND_PROCESSING_STOPPED differs from                 *
* OPHAND_PROCESSING_DONE by indicating stop bit was      *
* used to stop the processing.                           *
*********************************************************/
typedef enum OpHandReturn{
	OPHAND_PROCESSING_DONE=0,
	OPHAND_PROCESSING_STOPPED=1,
	OPHAND_UNKNOW_OPTION=2,
	OPHAND_NO_ARGUMENT=3,
}OpHandReturn;
/*********************************************************
* Bit field for the option flags.                        *
*                                                        *
* Members:                                               *
*   argument bit tells whether the option takes an       *
*            argument.                                   *
*   stop     bit tells whether parsing should stop after *
*            this option.                                *
*   type     bits tell what operation is performed.      *
*            - OPHAND_CMD_VALUE sets a 32 bit integer to *
*              the constant given or to the argument.    *
*            - OPHAND_CMD_POINTER_VALUE sets a pointer   *
*              to the constant given or to the argument  *
*              as a string.                              *
*            - OPHAND_CMD_FUNCTION calls the given       *
*              function to handle the option.            *
*            - OPHAND_CMD_OR performs an OR operation on *
*              a 32 bit variable. Doesn't support the    *
*              argument option.                          *
*            - OPHAND_CMD_AND performs an AND operation  *
*              on a 32 bit variable. Doesn't support the *
*              argument option.                          *
*            - OPHAND_CMD_PRINT prints a constant        *
*              message. Doesn't support the argument     *
*              option.                                   *
*********************************************************/
typedef struct OptionFlag{
	bool argument : 1;
	bool stop : 1;
	enum{
		OPHAND_CMD_VALUE,
		OPHAND_CMD_POINTER_VALUE,
		OPHAND_CMD_FUNCTION,
		OPHAND_CMD_OR,
		OPHAND_CMD_AND,
		OPHAND_CMD_PRINT,
	}type : 6;
}OptionFlag;
/*********************************************************
* Type for the function call if argument is hit.         *
* Programmer should send true if OptFunction doesn't     *
* cause error and false if error happened so that opHand *
* can stop.                                              *
*********************************************************/
typedef bool (*OptFunction)(char option,void *restrict coderdata,const char *restrict arg);
/*********************************************************
* Structure declaring option for ophand function.        *
*********************************************************/
typedef struct Option{
	char *longoption;
	union{
		int32_t *p32;
		char **str;
		OptFunction func;
		const char *printstr;
	}variable;
	union{
		int32_t *p32;
		void *coderdata;
		int32_t v32;
		char *str;
	}value;
	char option;
	OptionFlag flags;
}Option;

/*********************************************************
* Function performs the option handling. Returns         *
* OPHAND_PROCESSING_DONE or OPHAND_PROCESSING_STOPPED on *
* success and one of the error values otherwise.         *
* Non-option arguments are put to args (will override!)  *
* and null ending tells the end. If "--" is encountered  *
* then opHand's execution returns to caller.             *
*                                                        *
* NOTE I: optionslen isn't sanity checked so better      *
* put right size in.                                     *
* NOTE II: args or options aren't null checked so        *
* segment faults are on you!                             *
* NOTE III: if args element is null pointer it is just   *
* ignored.                                               *
* NOTE IV: Do not give argument options a type which     *
*          does not support arguments!                   *
*********************************************************/
OpHandReturn opHand(int argn,char *restrict *restrict args,const Option *restrict options,uint32_t optionslen);
#endif /* _OP_HAND_H_ */
    

OpHand.c

/********************************************************************
* This module is general functions for CLI handling.                *
********************************************************************/
#include<string.h>
#include<stdlib.h>
#include<unistd.h>
#include<stdbool.h>
#include"OpHand.h"

	/*********************************
	* Handles non-argument options.  *
	*********************************/
	static bool switchNonArgument(const Option *option){
		switch(option->flags.type){
			case OPHAND_CMD_VALUE:
				*option->variable.p32=option->value.v32;
				break;
			case OPHAND_CMD_OR:
				*option->variable.p32|=option->value.v32;
				break;
			case OPHAND_CMD_AND:
				*option->variable.p32&=option->value.v32;
				break;
			case OPHAND_CMD_POINTER_VALUE:
				*option->variable.str=option->value.str;
				break;
			case OPHAND_CMD_PRINT:
				(void)write(STDOUT_FILENO,option->variable.printstr,option->value.v32);
				break;
			case OPHAND_CMD_FUNCTION:
				// Return whatever callback returns
				return option->variable.func(option->option,option->value.coderdata,0);
		}
		return !option->flags.stop;
	}
	/*********************************
	* Handles argument options.      *
	*********************************/
	static bool switchArgument(char *arg,const Option *option){
		#pragma GCC diagnostic push
		#pragma GCC diagnostic ignored "-Wswitch"
		switch(option->flags.type){
			case OPHAND_CMD_VALUE:
				*option->variable.p32=atoi(arg);
				break;
			case OPHAND_CMD_POINTER_VALUE:
				*option->variable.str=arg;
				break;
			case OPHAND_CMD_FUNCTION:
				// Return whatever callback returns
				return option->variable.func(option->option,option->value.coderdata,arg);
		}
		#pragma GCC diagnostic pop
		return !option->flags.stop;
	}
	/***************
	* See OpHand.h *
	***************/
	OpHandReturn opHand(int argn,char *restrict *args,const Option *restrict options,uint32_t optionslen){

		// Points to next location where loop would
		// put non-option arguments. Used after all
		// arguments are processed to mark end of
		// non-option arguments.
		uint32_t nonoptpoint=0;

		// This stores what option is hit after long or short option
		// is hit.
		uint32_t foundoption;

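		// Simply go through the arguments and compare them to the options.
		// The caller is expected to have skipped the program name
		// already, as the example does by passing args+1.
		// The index is signed because comparing an unsigned index with
		// a negative argn would convert argn to a huge unsigned value.
		for(int arg=0;arg<argn;arg++){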

			// Check that argument wasn't null pointer since POSIX standard does
			// not define check for it at exec* function family.
			// Behavior for this is continue processing.
			if(args[arg]==0) continue;

			// Actual processing
			if(args[arg][0]=='-'){
				if(args[arg][1]=='-'){
					if(args[arg][2]!='\0'){
						for(foundoption=0;foundoption<optionslen;foundoption++){
							if(options[foundoption].longoption && strcmp(args[arg]+2,options[foundoption].longoption)==0){
								// If argument is needed we know that it is
								// next string in args array.
								if(HAS_ARGUMENT(options[foundoption].flags)){
									// Check that next argument exist.
									if(++arg<argn){
										bool result=switchArgument(args[arg],options+foundoption);
										if(result) goto jmp_OUTER_LOOP_CONTINUE;
										else return OPHAND_PROCESSING_STOPPED;
									}
									return OPHAND_NO_ARGUMENT;
								}
								else{
									bool result=switchNonArgument(options+foundoption);
									if(result) goto jmp_OUTER_LOOP_CONTINUE;
									else return OPHAND_PROCESSING_STOPPED;
								}
							}
						}
						return OPHAND_UNKNOW_OPTION;
					}
					else{
						// Since only two dashes were given, option processing ends here.
						break;
					}
				}
				else{
					if(args[arg][1]!='\0'){
						for(foundoption=0;foundoption<optionslen;foundoption++){
							if(args[arg][1]==options[foundoption].option){
								// If argument is needed we have to check
								// is argument start second offset or is it next argument.

								if(HAS_ARGUMENT(options[foundoption].flags)){
									if(args[arg][2]!='\0'){
											bool result=switchArgument(args[arg]+2,options+foundoption);
											if(result) goto jmp_OUTER_LOOP_CONTINUE;
											else return OPHAND_PROCESSING_STOPPED;
									}
									else{
										// Check that next argument exist.
										if(++arg<argn){
											bool result=switchArgument(args[arg],options+foundoption);
											if(result) goto jmp_OUTER_LOOP_CONTINUE;
											else return OPHAND_PROCESSING_STOPPED;
										}
									}
									return OPHAND_NO_ARGUMENT;
								}
								else{
									if(args[arg][2]=='\0'){
										bool result=switchNonArgument(options+foundoption);
										if(result) goto jmp_OUTER_LOOP_CONTINUE;
										else return OPHAND_PROCESSING_STOPPED;
									}
									else return OPHAND_UNKNOW_OPTION;
								}
							}
						}
					}
					return OPHAND_UNKNOW_OPTION;
				}
			}
			else args[nonoptpoint++]=args[arg];
			jmp_OUTER_LOOP_CONTINUE:;
		}

		// Mark the ending as description wanted!
		args[nonoptpoint]=NULL;

		return OPHAND_PROCESSING_DONE;
	}