Popular programming problems on Strings Data Type

Hello and welcome all to a wonderful series of competitive programming to ace coding interviews and crack high-profile software development jobs. We are moving slowly and I hope you are also maintaining a regular consistency of solving a minimum of 2 questions daily on the topic we have covered till now including Array data structure, sorting categories, Recursion, Backtracking, Bit manipulations. In this article, we will solve some basic and medium-level coding problems on a string to make our base strong to cover more advanced algorithms that are yet to come into our path.

Top coding problems on string for interviews


Introduction to a strings

A string is a data type in programming that is defined as a collection of sequences of characters. It is implemented as an array of bytes that stores a sequence of elements as each element corresponds to each character. The common operations that we can perform on staring are finding length, concatenating, copying, replacing, and counting the occurrence of character, and this operation can easily be performed with the help of in-built functions provided by almost all programming languages.

👉 Some programming language supports mutable string type and some do not. More specific to python string is an immutable standard data type so we can assign a new string to exist but cannot modify it.

1) Convert String Case problem

A very simple problem statement in a string to get started. Given a string of alphabetical words and your task is to convert the first character of each word in upper case.

Example

First-line contains the number of test cases and each test case has only one line string input.


Approach using in-built Library Function

The string contains different words separated by space and our task is to make the first character of each word capital so we can split the string by space and get an array of strings. After that, while iterating in an array, we can convert the first character of each word into uppercase and join the remaining characters.

This approach is mostly followed by students more familiar with python and will write this program only. we also have such functions in different languages too. But remember while iterating you cannot modify the existing word, you have to make and assign a new modified string in that place.

def convertString(str):
    ans = str.split()
    for i in range(len(ans)):
        ans[i] = ans[i][0].upper()+ans[i][1:]

    ans = " ".join(ans)
    return ans

Approach to convert string case without using library functions

The idea is to simply traverse the string and check for the lowercase character just after the space and if it is found then change it to uppercase using ASCII value.

How to convert any character from lowercase to uppercase using ASCII value?

To find the respective character ASCII value in uppercase subtract the ASCII value of lowercase character 'a' from the existing ASCII value of the existing character and add the ASCII value of uppercase character 'A' to it and convert it to a character.


Code ~ I hope that approach is pretty clear to you. So try to implement the code.

def convertString(str):
    lst = list(str) #array of each character in str
    
    #if first char is lowercase
    if lst[0] >= 'a':
        lst[0] = chr((ord(lst[0]) - ord('a')) + ord('A'))
        
    for i in range(1, len(lst)):
        #if we have space before it and in lowercase
        if lst[i-1] == ' ' and lst[i] >= 'a':
            lst[i] = chr((ord(lst[i]) - ord('a')) + ord('A'))
            
    str = "".join(lst)
    return str
    

s = "I am a student of the third year"
print(convertString(s))

2) Encode the Message

The problem statement is easy but implies a little bit of thinking. Given a text message, your task is to return a run-length encoding of the input message.

👉 Run-length encoding is a simple and fast method to denote a string with successive characters as a single character and denote with its count in string. observe the below example for a better understanding.

Example

Sample input and output are simple. Given several test cases and each test case contains a single line input as a string. You have to only implement logic in a given function.

How to find run-length encoding of a string (Approach)

This is a constructive approach that we have followed in the above problem statement. Traverse the string and keep track of the previous character and till its match is found increase the counter when another character is found store the previous character and its count in a new string.

  • Create a new empty string as result.
  • Initialize the previous character with the first character of the string and counter as 1.
  • Traverse in a string from 1 till n-1.
  • Until another character is found then the previous character increases the counter. else store the previous character and its count in a new string and update the value of the counter and previous character.

💻Code ~ After explaining the complete algorithm you should try to implement the code without any help from the below snippet.

def EncodeStr(message):
    n = len(message)
    result = ""
    count = 1
    prev_char = message[0]
    
    for i in range(1, n):
        if message[i] == prev_char:
            count += 1
            
        else:
            result = result + prev_char + str(count)
            #update the count and prev_char
            count = 1
            prev_char = message[i]
            
    #the last character will remain unadded in res so add it
    result = result + prev_char + str(count)
    return result

message = "aabbc"
print(EncodeStr(message))

3) Minimum number of parenthesis required to make string pattern valid

The question is very much important to understand because it is mostly asked in technical interviews as well as in different coding contests. The question comes in different forms like checking whether the given string pattern of parenthesis is valid or not.

So you are given a string pattern of only two types of characters as '(' and ')' and your task is to find a minimum number of parenthesis required to add to the string to make the pattern valid.

Example

Given several test cases and for each test case, you have a single string pattern of parenthesis.


How to find minimum parenthesis to make a pattern valid?

If you know the stack data structure then you might know that parenthesis matching is one of the top applications of stack. If you don't know stack then do not panic just read the below paragraph and continue to the approach.

  • Stack is a linear data structure in which the insertion and deletion of elements are done from a single end (It has only one open mouth to insert and delete the items). 
  • It works in a LIFO (last in first out) order. 
  • For example, you have a box and need to arrange books in it so you enter the first book and then another book that will take place above the previous inserted book. If you want to take out the first inserted book then first you need to take out all the previous books inserted above it. This only means that the book which is inserted at last is on top and will come out first.

I hope that what is stack data structure is clear to you. So we will create a stack. And iterate in a given string and whenever we found an opening bracket we will add it to a stack as we encounter the closing bracket we will check whether the top element of the stack is an opening bracket. If it has we will remove that, else we will add the closing bracket to the stack. In the last if the string pattern is valid then the stack will be empty else it will contain the minimum required parenthesis to make the pattern valid.

💻 Code ~ I hope that the approach is clear and we can try to implement it in python. If the approach is not clear then take a pen and paper and just dry-run the sample input with the code and it will clear to you.

def minimumParentheses(pattern):
    ans = 0
    stack = []
    for i in pattern:
        if i == "(":
            stack.append(i)
            
        elif i == ")":
            if len(stack) > 0 and stack[-1] == '(':
                #remove the opening bracket from stack
                stack.pop() 
                
            else:
                #add the closing bracket in stack
                stack.append(i) 
    
    #length of stack is minimum number of required parenthesis            
    ans = len(stack)
    return ans
    
pattern = ")((()"
print(minimumParentheses(pattern))

4) Longest substring with K Distinct Characters

One of the interesting problem statements. Given a string of lowercase alphabets and an integer K, your task is to find the longest substring with at most k distinct characters.

Example

The first line of input contains several test cases. For each test case, you have 2 lines of input in which the first line contains the integer K and the second line contains the input string.


How to find the longest substring of K distinct characters.

Naive Approach

This is a little bit tricky question. The first is a simple Brute-force approach where you use two loops to generate every possible substring and in a second loop, you add one by one character to the current substring and check whether the number of distinct characters is less than k or not and update the maximum length. This solution takes O(N^2) complexity. You can try this approach.

def check(substr, k):
    st = set(substr)
    if len(st) <= k:
        return True
        
    else:
        return False

def getLengthofLongestSubstring(s, k):
    n = len(s)
    max_length = 0
    
    for i in range(n):
        curr_substr = ""
        for j in range(i, n):
            curr_substr = curr_substr + s[j]
            
            if check(curr_substr, k):
                max_length = max(max_length, j-i+1)
                
    return max_length
    
s = "bccbababd"
k = 2
print(getLengthofLongestSubstring(s, k))

Optimized Approach using Hashmap

The above approach will work fine but will result in a time limit exceeding after passing a limited case. The optimal approach can be to use a hashmap in this case with two pointers. Let us understand the approach and algorithm step by step.

  1. First I declared one hashmap that will store the element as key and its current count as value. Initialize two-pointer as start and end that points at a first position.
  2. Now what we will do is iterate using the end pointer till we reach the last character and add the character and increase its count by 1 in the hashmap.
  3. In the same loop after adding a character, check that the length of the hashmap must be less than or equal to k. If it is true then continue which means we are generating the longest substring with unique K characters.
  4. If the length of the map exceeds K, then we have to decrease the window so now we start decreasing the frequency of elements from a map using the start pointer till the length of the map is not less than or equal to k.
  5. In this way, we can find the maximum possible substring length with K unique characters in linear time.

I have tried solving one sample input on a page with this approach. Its glimpse is shown below which will help you resolve any query related to the approach.

💻 Code ~ Now let us implement the approach.

def getLengthofLongestSubstring(s, k):
    n = len(s)
    
    #If req unique char is greater then n then return n
    if k > n:
        return n
        
    #If string is empty or k is negative
    if s == "" or k <= 0:
        return -1
        
    #Else follow our discussed hashmap approach
    max_length = 0
    hashmap = dict()
    start = 0
    end = 0
    
    while end < n:
        curr_ch = s[end]
        if curr_ch in hashmap:
            hashmap[curr_ch] += 1
            
        else:
            hashmap[curr_ch] = 1
            
        #update the end pointer
        end += 1
            
        #if hashmap len greater than k
        while(len(hashmap) > k):
            #start removing ele using start pointer
            ch_front = s[start]
            hashmap[ch_front] -= 1
            
            #If freq is 0 then del the key(char)
            if hashmap[ch_front] == 0:
                del hashmap[ch_front]
                
            start += 1
        
        #Update length of curr_substr with less than equal k     
        max_length = max(max_length, (end - start))
    
    return max_length
    
s = "bccbababd" #babab -> 5
k = 2
print(getLengthofLongestSubstring(s, k))

5) Anagram difference between two strings

One more very interesting and important problem statement under strings that is mostly asked in interviews. Given two strings of equal length. your task is to find the minimum number of manipulations in string either string so both strings are anagrams of each other.

👉 An Anagram is a word that denotes the word formed by rearranging the letters of any other word. It means the sorted order of both words is equal.

Example

First-line contains an integer T denoting the number of test cases. For each test case, we have two lines of input. first-line contains string one and the second line contains the second string of lowercase alphabets.


Find minimum required changes to make two strings Anagram

Naive Approach

The basic one is always a brute-force method. Use two loops which first to loop over a first string and the second to find the particular element of the first loop in the second loop. If it is found till the end then move forward. And in last also not found then increase the result by one.

def getMinimumAnagramDifference(str1, str2):
    lst1 = list(str1)
    lst2 = list(str2)
    n = len(str1) #both length are equal
    
    min_diff = 0
    for i in range(n):
        j = 0
        
        #find lst1[i] in lst2
        while j < n:
            #if found then convert to # to preserve another match
            if lst1[i] == lst2[j]:
                lst2[j] = "#"
                break
                
            j += 1
            
        #If it is not found
        if j == n:
            min_diff += 1 #chang required here
            
    return min_diff
    
s1 = "gate"
s2 = "late"
print(getMinimumAnagramDifference(s1, s2))

An optimized approach to make two strings Anagram.

We can again use the frequency map (hashmap) to find the minimum number of changes required in string 2 to make it equal to string 1. First, generate a frequency map of string 1. The iterate in string 2 and check whether the character in the map exists or not. if it exists and the count is greater than 0 then decrease the count by one. Else simply increase the result.

💻 Code ~ Try to dry run the code using sample input on paper so you are 100 percent clear with the approach.

def getMinimumAnagramDifference(str1, str2):
    freq_map = dict()
    min_diff = 0
    for i in str1:
        if i in freq_map:
            freq_map[i] += 1
        else:
            freq_map[i] = 1
            
    #Now check whether char of str2 is in str1
    for ch in str2:
        if ch in freq_map and freq_map[ch] > 0:
            #decrease the freq of ch by 1 if char exist
            freq_map[ch] -= 1 
            
        #else chng require
        else:
            min_diff += 1
            
    return min_diff
    
s1 = "accept"
s2 = "except"
print(getMinimumAnagramDifference(s1, s2))

6) Match a specific pattern to a list of strings

Your friend is given a list of words and a pattern string. And the task is to find all the words from a list that match a given pattern. Your friend is from a non-technical background and solving manually a large list is a tough task for him. So he needs your help to build a program to find all such words from the list that matches a given pattern.

🔰NOTE ~ A valid match is found if and only if when each character in the pattern is uniquely mapped to a single character in a word. Observe the below examples for more clearance.

Example (Sample Input and Output)

The first line contains several test cases. For each test case, we have 3 lines of input. the first line denotes the number of words. the second line contains space-separated words and the third line contains a pattern string.


How to find a word that matches a unique pattern?

The approach is simple that we can find a numeric pattern for a given pattern. And then while iterating in a list we can find a numeric pattern for each word and compare the word patterns with a given pattern. If both are equal then add a word to the resultant list. Observe the below-solved sample input to understand the approach in a better way.


💻 Code ~ Now we can code it easily.

def findNumericPattern(strng):
    pattern_map = {}
    res_pattern = ""
    cnt = 1
    for i in strng:
        if i not in pattern_map:
            pattern_map[i] = cnt
            cnt += 1
            
        res_pattern = res_pattern + str(pattern_map[i])
        
    return res_pattern

def matchSpecificPattern(words, n, pattern):
    result = []
    #find the numeric pattern of given pattern
    pattern_map = findNumericPattern(pattern)
    
    for word in words:
        word_pattern = findNumericPattern(word)
        
        if word_pattern == pattern_map:
            result.append(word)
    
    return result
    
t = int(input())
while t > 0:
    n = int(input())
    words = input().split()
    pattern = input()
    result = matchSpecificPattern(words, n, pattern)
    print(result)
    t = t-1

7) Compare two strings and find a new version

After solving the above problems this problem might seem simple to you and I hope that you will be able to implement the approach. So the problem statement is given two version numbers as a string. Your task is to compare both versions and find the new version. Print 1 if version A is new, -1 if version B is new and 0 if both versions are the same.

🔰NOTE ~ There are no leading zero in any of the cases except the zero itself. Leading zeros are not allowed at any position of version like 121.007 is invalid while 121.7 is a valid version.

Example

The first line contains T denoting the number of test cases. For each test case, we have 2 lines of input in which the first line has the first version of the string and the second line contains the second version.


How to compare two strings to find a new version

The approach is very simple to build and hope you have built till now. If not then try to think of different scenarios (cases that can form). let us discuss the different cases to check and find a new version.

  1. The first is simple if both strings are equal then simply return 0 (no new version).
  2. Otherwise, we split both the versions based on the dot and check if the first digit of any string is greater than the other string's first digit returning the corresponding answer.
  3. Else find the length of both the list length and whichever length is small or equal iterate till that length and compare both the particular version digits. 
  4. If this also does not give the answer then it means at a certain length both strings have the same part and one of the string lengths is greater than the other. To find the greater length and iterate from small length to greater length and check that the part is simply greater than 0 because the current version is already bigger.

💻 Code ~ I hope the complete algorithm is clear to you and can be ready to implement the algorithm.

def compareVersions(a, b):
    #Step-1
    if a == b:
        return 0
    
    #STEP-2    
    lst_a = a.split('.')
    lst_b = b.split('.')
    
    if int(lst_a[0]) > int(lst_b[0]):
        return 1  #first version bigger
        
    elif int(lst_b[0]) > int(lst_a[0]):
        return -1  #second version bigger
        
    # STEP-3
    na = len(lst_a)
    nb = len(lst_b)
    small_len = na if na <= nb else nb
    
    for i in range(1, small_len):
        if int(lst_a[i]) > int(lst_b[i]):
            return 1  #first version bigger
            
        elif int(lst_b[i]) > int(lst_a[i]):
            return -1  #second version bigger
            
    #STEP-4 
    #It means till eql length both have same parts and 1 version is bigger
    if na > nb:
        for i in range(nb, na):
            #just check part is greater then 0
            if int(lst_a[i]) > 0:
                return 1
                
    elif nb > na:
        for i in range(na, nb):
            if int(lst_b[i]) > 0:
                return -1
    
    #else return 0            
    return 0
        
a = "123.45"
b = "123"
print(compareVersions(a, b))

8) Find the Kth Character of Decrypted String

Given an encrypted string in form of run-length encoding of string (the problem we have solved above) and an integer K, You have to find the character at position K of the given encrypted string when it will be in its original form after decryption. The answer should be in one-based indexing.

🔰NOTE ~ 

  • The input string is always in lowercase characters without any space.
  • Each substring will be followed by some integer count even 1.
  • The frequency of the substring can also be more than one digit. for example ab15g7, here ab is repeated 12 times.
  • The substring can also be in parts of the encrypted string given to you. for example a2a2b3.

Example

The first line of each test case contains an encrypted string. And the second line contains an integer K.


How to find the Kth character of the decrypted string (Approach)

The simple approach is to first find the decrypted string and return the Kth position character. But the solution will lead to high complexity and time limit exceed. Think over the same approach a little bit about how we can optimize the approach without decryption.

Approach without decryption

  • We will simply iterate through the encrypted string and keep track of the length of the decrypted substring.
  • Find the length of the substring while traversing till no digit is found.
  • Then find the frequency of substring by traversing till no lowercase alphabet is found. we can use this relation to create the length of the substring that contributes to the final decrypted string.
  • If the resultant length of the currently repeated substring is less than K then the required character is present in the later part of a string. So simply subtract the resultant length from K to know how many more characters we need to visit.
  • And if the resultant length is less than K then the required character lies in the current substring.
  • Keep repeating the approach till we reach the Kth character. And using this algorithm we can optimize the above approach.

💻 Code ~ now it's time to implement the algorithm.

def kThCharaterOfDecryptedString(s, k):
    n = len(s)
    i = 0
    
    freqOfSubstr = 0
    substrLength = 0
    
    while i < n:
        j = i
        while(j < n and s[j].islower()): #.isalpha()
            j += 1
            substrLength += 1
            
        while(j < n and s[j].isdigit()):
            j += 1
            freqOfSubstr = freqOfSubstr * 10 + (ord(s[j]) - ord('0'))
            
        #find len of current substring
        resultLength = substrLength * freqOfSubstr
        
        if(k > resultLength):
            #char is present in further part of encry strng
            k = k - resultLength
            i = j
            
        else:
            #char is present in current substr
            k = k - 1
            k = k % substrLength
            kthChar = s[i+k]
            break
        
    return kthChar

End Notes

These are all the top coding problems under string data type which are mostly asked in interviews by product-based companies, startups, and by small to large scale service-based industries. Based on these seven to eight problems we learned how to think, observe the right pattern for a problem and optimize the naive approach in linear time to reduce complexity in such a way as to think where we are repeating to visit the character so there is a way to visit only once and do calculations. 

👉 Now only one thing you need to do is go to any coding competitive sites of your choice and practice some questions under a similar category. If you are stuck forming a solution to any problem then solutions are uploaded but instead of observing the solution, you have to search for the approach to the same problem and then try to build it. It will help you maintain your spark and consistency.

Thank you for your time! 😊

Post a Comment

If you have any doubt or suggestions then, please let me know.

Previous Post Next Post