I wrote a small program, about 10,500 lines and everything works perfectly.
It is a Math game, which has is login and sign-up process and all data is stored on the user’s computer in multiple text files. It has support for multiple users.
I wrote a function which allows the user to search for another user in the database, and if the username is not found the program attempts to find the user name in the file which closely resemble the one that was entered and provide suggestions.
However, this function seems to execute really slowly when there are a lot of usernames in the file.
How I can optimize it?
// Function to generate suggestions
void Suggestions (char passed[])
{
/**
In the case that the administrator makes an error while searching for a user,
tghe program attempts to determine the username in the database which closest
matches the username that wa entered
*/
// Vaiables
char sname[MAX], record[MAX];
int w, x, y, z;
char abs = 'N', f = 'N';
int len = 0, temp = 0;
// Sets them all to 0
w = x = y = z = 0
taken user[MAX];
// Attempts to open file
ptrRead = fopen("UserNames.smq", "r");
if(ptrRead == NULL)
{
printf("nnThe file was not found ...nnn");
}
else
{
// Copies the incorrect username passed to the function
strcpy(sname, passed);
// Encrypts the passed name
for(x = 0; x < strlen(passed); x++)
{
sname[x] += key;
}
// Sets x to 0
x = 0;
// Reads all of file with the usernames
while(!feof(ptrRead))
{
fgets(record, MAX - 1, ptrRead);
// Compares the strings
if(strcmp(record, sname) == 0)
{
// Sets found to yes
f = 'Y';
// Breaks our of loop
break;
}
// End if
}
// End while
if(f == 'N')
{
// Rewinds the file pointer
rewind(ptrRead);
// Reads the data of the file and calculate various contents
while(!feof(ptrRead))
{
// Sets temp to 0
temp = 0;
// Gets the username
fgets(record, MAX - 1, ptrRead);
// Copes the information
strcpy(user[x].userName, record);
// Calculate the length
len = strlen(user[x].userName) - strlen(sname);
// Determines if the name is valid for testing
if(len == -2 || len == -1 || len == 0 || len == 1 || len == 2)
{
/**
This if statement limits the number of work trhe program does by eliminating
the usernames with a difference in string length grater than 2 or less than - 2
*/
// Counts the number of letters common in a sequence
for(w = 0; w < strlen(user[x].userName); w++)
{
/**
It does not matter which string length the for-loops runs for because
when the average is calculated, the same result appears both ways
*/
// Compares both letters
if(user[x].userName[w] == sname[w])
{
/**
If both letters at the current index are the same, then temp will be
increased by the value 1. If not, then temo will not increase. This gets
the number of letter the two names have in common in a sequence.
*/
temp++;
}
// End if
}
/* End for */
// Calculates the average letters in common
user[x].lettersInCommom = (float) temp / ((strlen(user[x].userName) + strlen(sname)) / 2) * 100.0;
// Sets temp to 0
temp = 0;
// Calculates the number of letters the have in common
for(y = 0; y < strlen(sname); y++)
{
/*
Runs the loop until the string length of the incorrect username - Just for fun
*/
for(w = 0; w < strlen(user[x].userName) ; w++)
{
/**
This for loop runs for the entire string lenght of the username in the file
*/
// Compares the letters
if(user[x].userName[w] == sname[y])
{
/**
Compares the characters at the current index. If they are the same,
then temp increase by 1. The purpose of these nested loops is to
check the number of letters the two name have in common in general,
and not just in a sequence.
*/
temp++; // Increments temp
}
}
}
// Calculates the average letters in common in general
if(strlen(user[x].userName) > strlen(sname))
{
user[x].lettersInGeneral = (float) temp / (float) strlen(sname) * 100.0;
}
else
{
user[x].lettersInGeneral = (float) temp / (float) strlen(user[x].userName) * 100.0;
}
// Sets temp to 0
temp = 0;
// Checks if both names begins with the same letter
if(sname[0] == user[x].userName[0])
{
user[x].beginsWithFirst = 100.0; // Assign 100
}
else
{
user[x].beginsWithFirst = 0.0; // Assign 0
}
// Checks if the names ends with the same letter
if(sname[strlen(sname) - 1] == user[x].userName[strlen(user[x].userName) -1])
{
user[x].endsWithLast = 100.0; // Assign 100
}
else
{
user[x].endsWithLast = 0.0; // Assign 0
}
}
/**
Calculates the rank. The rank is an overall percentage that represents the average similarity
that the username in the file has with the incorrect username entered. Similar usernames
normally have a rank of over 70%
*/
user[x].rank = (user[x].lettersInCommom + user[x].lettersInGeneral + user[x].beginsWithFirst + user[x].endsWithLast) / 4.0;
// Increments x
x++;
}
// Closes the file
fclose(ptrRead);
// Sets temp to 0
temp = 0;
for(z = 0; z < x; z++)
{
/*
Runs until the number of usernames in the file is reached
*/
if(user[z].rank > 70.00 && user[z].rank < 200.00)
{
// Increments temp by 1 if conditon is met
temp++;
}
}
if(temp > 0)
{
temp = 0; // Resets temp
/*
This is done to determine whether or not to print the contents in this if statement.
If no suggestion were found then printing "Did you mean" would be a waste of processing.
*/
// Clears the screen
system("cls");
printf("nttttOoops!");
printf("nttt*********************nn");
printf("nLooks like you made an error in your search. ");
printf("Did you mean: nn");
// For all the usernames
for(z = 0; z < x; z++)
{
// Checks the rank
if(user[z].rank > 70.00 && user[z].rank < 200.00)
{
/* Increments temp */
temp++;
for(w = 0; w < strlen(user[z].userName); w++)
{
// Decrypts the name
user[z].userName[w] -= key);
}
if(key > 2)
{
/**
This removes a weird looking character that always shows at the end
of the username when the key is greater than 2.
*/
user[z].userName[strlen(user[z].userName) - 1] = '';
}
// Prints decrypted user name as is
printf("n(%d)t%s", temp, user[z].userName);
}
}
// Allows the admin to search again
ViewSingleUser();
}
else
{
// Prints error
printf("nnWe were unable to detect any possible matches ...");
}
}
}
}
// Function ends here
5
general hints
I would consider (hoping that you are using Linux and GCC, which is an excellent developer environment):
- read a good Introduction to Algorithms (that is, spend at least a week reading it); be aware of good container algorithms like hash tables and self-balancing binary search trees. BTW, if you coded in C++11 or C++14 (not C99 or C11) or in Ocaml, its standard library provides them (e.g. C++ standard containers like
std::map
orstd::unordered_set
, etc.. or Ocaml standard library modules like Set or Hashtbl, etc…). Have a good estimate of time complexity of your functions. - enable all warnings and debug info in your code (so compile with
gcc -Wall -Wextra -g
); improve your code till you get no warnings - test your program and have some test suite
- use valgrind to hunt memory leaks
- profile your code, e.g. by compiling it with
gcc -pg -Wall -g
then using gprof; find out the functions consuming most of the CPU time and focus on improving and rewriting them - consider perhaps using some relational database, e.g. read some SQL tutorial and use sqlite.
- consider using some Levenshtein distance to compare similarity of strings
- study the source code of free software projects similar to yours.
about your code
(we don’t know how your Suggestion
function is called and what it should really do)
Notice that your nested loops look suspicious.
Use perror
or strerror(errno)
on failure of fopen
so code:
ptrRead = fopen("UserNames.smq", "r");
if(ptrRead == NULL) {
perror("UserNames.smq");
return; // or perhaps exit(EXIT_FAILURE);
}
The f
variable should be a bool f = false;
since it is a boolean flag.
You don’t modify sname
after its initialization from passed
, so why don’t you declare void Suggestions (const char* passed)
and use passed
everywhere instead of sname
? No need to copy the unmodified passed
string to sname
(which you don’t modify neither)…
You don’t handle or avoid buffer overflows. In particular, if the sname
happens to be a very long string (wider than your MAX
…) you have the dreaded undefined behavior
BTW, loops like for(y = 0; y < strlen(sname); y++)
are really inefficient. You are computing the string length at every iteration; better memoize it and compute the string length once:
size_t snlen = strlen(sname);
for (y=0; y<snlen; y++)
(some compilers might be able to do such an optimization, but you should not take it as granted)
Also, use more systematically pointers to char
, e.g. replace
for(w = 0; w < strlen(user[z].userName); w++)
{
// Decrypts the name
user[z].userName[w] -= key);
}
with
for (char*pc = user[z].userName; *pc; pc++)
*pc -= key;
Such a loop is shorter to write and use character pointers, don’t compute the strlen
at every iteration, and use the fact that strings are terminated by a zero byte.
Of course, be aware that your encryption is really poor; if you need some, use an existing encryption library or function (see crypt(3) and TLS …)
0