KK's blog

Amazon Prime Free Trial

This Prime Day, save up to 70% on speakers, smart TVs, smart watches and more!

Blink Outdoor Camera - Early Prime Day Deals

Ads by Amazon

LeetCode 269 Alien Dictionary

Posted on 2021-11-29 Edited on 2021-12-13 Disqus:

LeetCode

There is a new alien language that uses the English alphabet. However, the order among the letters is unknown to you.

You are given a list of strings words from the alien language’s dictionary, where the strings in words are sorted lexicographically by the rules of this new language.

Return a string of the unique letters in the new alien language sorted in lexicographically increasing order by the new language’s rules. If there is no solution, return "". If there are multiple solutions, return any of them.

A string s is lexicographically smaller than a string t if at the first letter where they differ, the letter in s comes before the letter in t in the alien language. If the first min(s.length, t.length) letters are the same, then s is smaller if and only if s.length < t.length.

Example 1:

Input: words = [“wrt”,”wrf”,”er”,”ett”,”rftt”]
Output: “wertf”

Example 2:

Input: words = [“z”,”x”]
Output: “zx”

Example 3:

Input: words = [“z”,”x”,”z”]
Output: “”
Explanation: The order is invalid, so return "".

Constraints:

1 <= words.length <= 100 1 <= words[i].length <= 100
* words[i] consists of only lowercase English letters.

算法思路：

N/A

注意事项：

题目要求：空字符顺序前于非空字母，否则字典不合法，如abc -> ab, c不能前于空字符，无解。简而言之，后面单词不能是前面的前缀return False if len(word) > len(word2) else True
模板问题： graph要含所有节点，包括没有边的节点。否则结果会有遗漏graph = Counter({c: [] for word in words for c in word})
模板问题： in_degree初始化要对所有节点赋0， in_degree[c] = 0。in_degree = collections.defaultdict(int)并不能产生key
模板问题：第四步判断是否含循环必不可少，题目要求可能不合法，return res if len(graph) == len(res) else ‘’
语法错误： graph.items()记得加items。res是str不是list

Python代码：

def alienOrder(self, words: List[str]) -> str:
	# graph = collections.defaultdict(list)
	graph = Counter({c: [] for word in words for c in word})
	for i in range(1, len(words)):
		if not self.populate_one_order(words[i - 1], words[i], graph):
			return ''
	in_degree = collections.defaultdict(int)
	for c in graph.keys():
		in_degree[c] = 0
	for key, li in graph.items():
		for j in range(len(li)):
			in_degree[li[j]] += 1
	res = ''
	queue = deque([node for node, in_degree_num in in_degree.items() if in_degree_num == 0])
	while queue:
		node = queue.popleft()
		res += node
		for neighbor in graph[node]:
			in_degree[neighbor] -= 1
			if in_degree[neighbor] == 0:
				queue.append(neighbor)
	return res if len(graph) == len(res) else ''

def populate_one_order(self, word, word2, graph):
	for j in range(min(len(word), len(word2))):
		if word[j] != word2[j]:
			graph[word[j]].append(word2[j])
			return True
	return False if len(word) > len(word2) else True

算法分析：

时间复杂度为O(V + E)，空间复杂度O(O + E)，V为节点数，E为边数，n为单词数，L为最长单词长度，O = nL， E = L，而空间复杂度图的空间为O + E，而queue空间最多为26条边，26个字母，in_degree空间为O(V)。所以总的时间复杂度为O(nL)，空间复杂度O(nL)

Trie

Posted on 2021-11-29 Edited on 2022-01-10 Disqus:

算法思路：

Leetcode 208 Implement Trie (Prefix Tree), 也可以用HashMap将所有前缀加入到Map来实现，效率稍低。

注意事项：

TrieNode用{}和is_end，insert, search, startswith用it和i迭代比较
startswith含整个单词，如单词apple，startswith(‘apple’) -> True
~~Line 11记得加，也就是dict在取value是一定要先检验key是否存在。~~可以不加，解决方案是用defaultdict（后版本）。

新版本比旧版本更简洁，体现在is_end的处理放在了for循环外。
存储结构举例： a

TrieNode(
  children = { 
              'a':  TrieNode(is_End = True)
		     }
  is_end = False
)

Python代码：


    def __init__(self):
        self.head = TrieNode()

    def insert(self, word: str) -> None:
        it = self.head
        for i in range(len(word)):
            # if word[i] not in it.children:
                #it.children[word[i]] = TrieNode()
            it = it.children[word[i]]
        it.is_end = True

    def search(self, word: str) -> bool:
        it = self.head
        for i in range(len(word)):
            if word[i] not in it.children:
                return False
            it = it.children[word[i]]
        return it.is_end

    def startsWith(self, prefix: str) -> bool:
        it = self.head
        for i in range(len(prefix)):
            if prefix[i] not in it.children:
                return False
            it = it.children[prefix[i]]
        return True

class TrieNode:

    def __init__(self):
        self.children = collections.defaultdict(TrieNode)  # {}
        self.is_end = False

旧版本：冗余了一个if语句if i == len(word) - 1

Python代码：

class Trie(TestCases):

        def __init__(self):
        self.head = TrieNode()

    def insert(self, word: str) -> None:
        if not word:
            return
        it = self.head
        for i in range(len(word)):
            # if word[i] not in it.children:
                # it.children[word[i]] = TrieNode()
            it = it.children[word[i]]
            if i == len(word) - 1:
                it.is_end = True

    def search(self, word: str) -> bool:
        if not word:
            return False
        it = self.head
        for i in range(len(word)):
            if word[i] not in it.children:
                return False
            it = it.children[word[i]]
            if i == len(word) - 1 and it.is_end:
                return True
        return False

    def startsWith(self, prefix: str) -> bool:
        if not prefix:
            return False
        it = self.head
        for i in range(len(prefix)):
            if prefix[i] not in it.children:
                return False
            it = it.children[prefix[i]]
            if i == len(prefix) - 1:
                return True
        return False

class TrieNode:

    def __init__(self):
        self.children = collections.defaultdict(TrieNode)  # {}
        self.is_end = False

算法分析：

每个操作时间复杂度为O(n)，空间复杂度O(n)，n为单词长度。

LeetCode 1048 Longest String Chain

Posted on 2021-11-27 Edited on 2022-01-28 Disqus:

LeetCode 1048 Longest String Chain

You are given an array of words where each word consists of lowercase English letters.

wordA is a predecessor of wordB if and only if we can insert exactly one letter anywhere in wordA without changing the order of the other characters to make it equal to wordB.

For example, "abc" is a predecessor of "abac", while "cba" is not a predecessor of "bcad".

A word chainis a sequence of words [word1, word2, ..., wordk] with k >= 1, where word1 is a predecessor of word2, word2 is a predecessor of word3, and so on. A single word is trivially a word chain with k == 1.

Return the length of the longest possible word chain with words chosen from the given list of words.

Example 1:

Input: words = [“a”,”b”,”ba”,”bca”,”bda”,”bdca”]
Output: 4
Explanation: One of the longest word chains is [“a”,”ba”,”bda”,”bdca”].

Example 2:

Input: words = [“xbc”,”pcxbcf”,”xb”,”cxbc”,”pcxbc”]
Output: 5
Explanation: All the words can be put in a word chain [“xb”, “xbc“, “cxbc”, “pcxbc”, “pcxbcf“].

Example 3:

Input: words = [“abcd”,”dbqca”]
Output: 1
Explanation: The trivial word chain [“abcd”] is one of the longest word chains.
[“abcd”,”dbqca”] is not a valid word chain because the ordering of the letters is changed.

Constraints: 1 <= words.length <= 1000
1 <= words[i].length <= 16 words[i] only consists of lowercase English letters.

Problem Overview

Get longgest one-character transformation in the given list

Thinking Process

This problem is similar to Word Break (reversed to get neighbors) but it is a multi-source longest path problem.

Steps

Loop through each word
BFS from each word and get the max distance
Get the max of distance

Notes

max_dis = 1 by default in BFS
To improve the complexity, make the distance map global so that the distance of each node will be calculated once.
To do that, sort the list from longest to shortest and make sure the it is greedy to get the max distance

Python代码：

def longestStrChain(self, words: List[str]) -> int:
	word_dict, distance = set(words), {}
	max_dis = 0
	words.sort(key=len, reverse=True)  # remember
	for word in words:
		dis = self.bfs(word, word_dict, distance)
		max_dis = max(max_dis, dis)
	return max_dis

def bfs(self, word, word_dict, distance):
	queue = deque([word])
	visited = {word}
	if word in distance:  # remember
		return distance[word]
	distance[word] = 1
	max_dis = 1
	while queue:
		node = queue.popleft()
		for neighbor in self.get_neighbors(node, word_dict):
			if neighbor in visited:
				continue
			queue.append(neighbor)
			visited.add(neighbor)
			distance[neighbor] = distance[node] + 1
			max_dis = max(max_dis, distance[neighbor])
	return max_dis

def get_neighbors(self, word, word_dict):
	res = []
	for i in range(len(word)):
		new_word = word[:i] + word[i + 1:]
		if new_word in word_dict:
			res.append(new_word)
	return res

Analysis

Though there is a loop and bfs like n^2, actually it is a traversal in a graph. n * L (n is # of nodes, L is max of path
single-source) is the num of edges. Another L is to get neighbors.
Time complexityO(nlogn + n*L²), Space complexityO(n).

DP

Posted on 2021-11-27 Edited on 2022-03-08 Disqus:

典型递归式
单序列或匹配型DP模板
数值到个数DP模板(最常考)
区间型DP模板
压缩空间
打印路径

典型递归式：

单序列型(与一个或多个前状态有关)
硬币个数(数值型)： dp[i] = min(dp[i], dp[i - j] + 1)
偷房子： dp[i] = max(dp[i-1], dp[i-2] + nums[i - 1])
LIS： dp[i] = max(dp[i], dp[j] + 1) 1 <= j < i
Work break： dp[i] |= dp[j] and s[j:i] in word_set, 0 <= j < i

坐标型DP(与上、左等两个前状态有关)

1	dp[n][m] = max{1, min(dp[n+1][m], dp[n][m+1]) - dungeon[n][m]}

匹配型(与上、左、左上等三个前状态有关)

1 2	正则： dp[i][j] = dp[i-1][j-1] && (p[j-1] == . \|\| s[i-1] == p[j-1]) OR ((dp[i-1][j] && (s[i-1] == p[j-2] \|\| p[j-2] == .)) \|\| dp[i][j-2]) && p[j-1] == *

区间型(与该区间内以k为划分的两个子区间有关)

1	dp[i][j] = max{dp[i][m] + nums[i]×nums[m]×nums[j] + dp[m][j]}, i < m < j

多状态型DP(DP有多个终止状态)

dp = dp     if s[i] = '0'
   = dp + 1 if s[i] = '1'
   
dp2 = min(dp2 + 1, dp + 1)     if s[i] = '0'
    = min(dp2, dp)             if s[i] = '1'

解题步骤：

9zhang按照DP四部曲

[初始化]dp数组
[实现]
[答案]
[优化]是否可以用滚动内存优化空间

单序列或匹配型DP模板

实现注意事项(5点)：

[多1][初始化]初始化矩阵，dp是原长度加1。因为令边界计算方便. 遍历从多少开始，取决于前状态多少个，若递归式含dp[i-1],从1开始，如果dp[i-2]就从2开始.数值型DP，如硬币个数DP长度是数值amount，不是数组长度。Python用dp = [[0 for _ in range(M)] for _ in range(N)], M为col数，再row数
[特殊初始值]是第一步的完善，若一维初始化dp0，若为矩阵，初始化dp[0][0]以及上和左边界。求个数dp[0] = 1, 求最值, dp[0] = 0因为递归式含+1
[多1]跳过初始值，i从1循环到len(dp)而不是len(s), 因为用了dp[0]占位。若不用占位法，就要用if语句判断第一位的边界。大部分递归式含i-1项，所以绝大部分从1开始，从什么开始取决于递归式，这也受第二步的影响
[少1]用到原数组s时候是s[i - 1]而不是s[i], 因为用了dp[0]占位。实现时候递归式和条件要合法，保证边界内，如dp[i - t] => i - t >=0和s[i - 1 - dp[i - 1] - 1]，见最长括号数
[答案]不一定是dp[-1]，如果以某位结束的答案而不是累积结果，一般就用max_len

累积DP递归公式

1	dp[n] = max(dp[n - 1], f)

Python代码：

def dp(self, s):
	dp = [False] * (len(s) + 1)
	dp[0] = True
	for i in range(1, len(dp)):
		...
	return dp[-1]

算法分析：

时间复杂度为O(n)，空间复杂度O(n)。

应用题型：

LeetCode 139 Word Break
LeetCode 044 Wildcard Matching
LeetCode 010 Regular Expression Matching

数值到个数DP模板(最常考)

求个数DP是比较简单的DP，但也是比较常考的。一般DP是前n个，这里是用数值，这点还是有点难的。

递归公式

1	dp[n + num[i]] = dp[n], n = [1, tgt], i = [0, len(nums) - 1]

注意事项：

dp[i + nums[j]] += dp[i] 而不是dp[i] + 1
dp[0] = 1表示数值为0，可以不用任何数就能获得，所以是1种
i从0开始，因为此题递归式用相加i + nums[j]，而不是相减
nums先排序，否则如[3, 1, 2, 4]，返回dp[1] = 0, 但应该是dp[1] = 1
上述4点注意事项只有1和2。数

def dp(self, nums, target):
	nums.sort() # remember
	dp = [0] * (target + 1)
	dp[0] = 1 # remember
	for i in range(len(dp)):
		for j in range(len(nums)):
			if i + nums[j] > target:
				break
			dp[i + nums[j]] += dp[i] # remember no +1
	return dp[-1]

应用题型：

LeetCode 377 Combination Sum IV

一个变体是求最小个数

递归公式

1	dp[n + coins[i]] = min{dp[n + coins[i]], dp[n] + 1}, n = [1, tgt], i = [0, len(nums) - 1]

不同之处：

递归式是dp[i] + 1
dp初始值为最大值，而且dp[0] = 0

def dp(self, coins: List[int], amount: int) -> int:
	coins.sort()
	dp = [float('inf')] * (amount + 1)
	dp[0] = 0
	for i in range(len(dp)):
		for j in range(len(coins)):
			if i + coins[j] > amount:
				break
			dp[i + coins[j]] = min(dp[i + coins[j]], dp[i] + 1)
	return dp[-1] if dp[-1] < float('inf') else -1  # remember

应用题型：

LeetCode 322 Coin Change
LeetCode 871 Minimum Number of Refueling Stops

算法分析：

时间复杂度为O(n*target)，空间复杂度O(n)。

区间型DP模板

k循环从1还是从2开始取决于递归式

def dp(self, s: str) -> int:
	# init dp array
	for k in range(1, len(s)):
		for i in range(len(s) - k):
			j = i + k
			# dp formula
	return dp[0][-1]

应用题型：

LeetCode 516 Longest Palindromic Subsequence
LeetCode 312 Burst Balloons

压缩空间

若递归式和上一行有关，绝大部分属于此情况，就可以用两行，原数量列来处理

与标准模板不同之处：

行个数为2
i仍然是遍历到所有行，所以不能用len(dp),而是len(text1) + 1.
dp行不用i，而是i % 2

def longestCommonContinuous(self, text1: str, text2: str) -> int:
	dp = [[0 for _ in range(len(text2) + 1)] for _ in range(2)]
	res = 0
	for i in range(1, len(text1) + 1):
		for j in range(1, len(dp[0])):
			if text1[i - 1] == text2[j - 1]:
				dp[i % 2][j] = dp[(i - 1) % 2][j - 1] + 1
				res = max(res, dp[i % 2][j])
	return res

应用题型：

Karat 002 Longest Common Continuous Subarray
LeetCode 063 Unique Paths II

打印路径

三种类型：

LCA: Karat 002 Longest Common Continuous Subarray res = history1[i - dp[i][j]:i]
LIS:
LeetCode 300 Longest Increasing Subsequence
LeetCode 368 Largest Divisible Subset
LCS: LeetCode 1143 Longest Common Subsequence

LIS过程：

path跟dp数组一样大，记录回溯跳转j -> i路径path[i - 1] = j - 1
记录回溯开始点if res == dp[i]: biggest_pos = i - 1
循环长度为res, 从biggest_pos开始

Python代码：

def print_path(self, nums, path[:biggest_pos + 1], dp_value = res):
	pos, res = len(path) - 1, []
	for _ in range(dp_value):
		res.append(nums[pos])
		pos = path[pos]
	return res[::-1]

LCS:

path和dp一样，不用特别数组
从右下到左上，若上或左值一样，向此方向移动，直到左和上少一，此时res -= 1

Python代码：

longest, res = dp[-1][-1], ''
while m >= 0 and n >= 0:
	if dp[m - 1][n] == longest:
		m -= 1
	elif dp[m][n - 1] == longest:
		n -= 1
	else:
		res += text1[m - 1]
		longest -= 1
		m -= 1
		n -= 1

LeetCode 2080 Range Frequency Queries

Posted on 2021-11-20 Edited on 2021-11-21 Disqus:

LeetCode 2080 Range Frequency Queries

Design a data structure to find the frequency of a given value in a given subarray.

The frequency of a value in a subarray is the number of occurrences of that value in the subarray.

Implement the RangeFreqQuery class:

RangeFreqQuery(int[] arr) Constructs an instance of the class with the given 0-indexed integer array arr.
int query(int left, int right, int value) Returns the frequency of value in the subarray arr[left...right].

A subarray is a contiguous sequence of elements within an array. arr[left...right] denotes the subarray that contains the elements of nums between indices left and right (inclusive).

Example 1:

**Input**
["RangeFreqQuery", "query", "query"]
[[[12, 33, 4, 56, 22, 2, 34, 33, 22, 12, 34, 56]], [1, 2, 4], [0, 11, 33]]
**Output**
[null, 1, 2]

**Explanation**
RangeFreqQuery rangeFreqQuery = new RangeFreqQuery([12, 33, 4, 56, 22, 2, 34, 33, 22, 12, 34, 56]);
rangeFreqQuery.query(1, 2, 4); // return 1\. The value 4 occurs 1 time in the subarray [33, 4]
rangeFreqQuery.query(0, 11, 33); // return 2\. The value 33 occurs 2 times in the whole array.

Constraints:

1 <= arr.length <= 105
1 <= arr[i], value <= 104
0 <= left <= right < arr.length
At most 105 calls will be made to query

题目大意：

设计数据结构，支持在指定的子数组中某target的频数。

解题思路：

这题有意思，有望成为经典题。第一种方法写用以每个字母为结尾的frequency_dict作为数据结构，query只需要用frequency_dict[right]-frequency_dict[left-1]，
空间复杂度为O(n²)得到TLE。第二种方法用bucket sort，就是将值作为key存到dict中，而value是下标List，query时候得到对应List，遍历
一次即可，但仍然得到TLE。第三种方法改进用Binary search得到下标List的左右界。二分法用greater_or_equal_position以及small_or_equal_position.
所以最终方案采取bucket sort + binary search

解题步骤：

dict记录值到下标列表的映射
二分法找left和right的index从而求个数

注意事项：

Binary search用greater_or_equal_position以及small_or_equal_position.

Python代码：

class RangeFreqQuery:
	
    def __init__(self, arr: List[int]):
        self.frequency_dict = {}

        for i, n in enumerate(arr):
            if n not in self.frequency_dict:
                self.frequency_dict[n] = [i]
            else:
                self.frequency_dict[n].append(i)

    def query(self, left: int, right: int, value: int) -> int:
        if value not in self.frequency_dict:
            return 0
        index_list = self.frequency_dict[value]
        '''
        count = 0
        for index in index_list:
            if left <= index <= right:
                count += 1
        '''
        left_pos = self.greater_or_equal_position(index_list, left)
        right_pos = self.smaller_or_equal_position(index_list, right)
        if left_pos == -1 or right_pos == -1:
            return 0
        return right_pos - left_pos + 1

    def greater_or_equal_position(self, nums: List[int], target: int) -> int:
        if not nums:
            return -1
        start, end = 0, len(nums) - 1
        while start + 1 < end:
            mid = start + (end - start) // 2
            if target > nums[mid]:
                start = mid
            elif target < nums[mid]:
                end = mid
            else:
                start = mid
        if nums[start] >= target:
            return start
        if nums[end] >= target:
            return end
        return -1

    def smaller_or_equal_position(self, nums: List[int], target: int) -> int:
        if not nums:
            return -1
        start, end = 0, len(nums) - 1
        while start + 1 < end:
            mid = start + (end - start) // 2
            if target > nums[mid]:
                start = mid
            elif target < nums[mid]:
                end = mid
            else:
                end = mid
        if nums[end] <= target:
            return end
        if nums[start] <= target:
            return start
        return -1

用defaultdict(list)和bisect来优化程序

Python代码：

def __init__(self, arr: List[int]):
	self.frequency_dict = collections.defaultdict(list)

	for i, n in enumerate(arr):
		self.frequency_dict[n].append(i)

def query(self, left: int, right: int, value: int) -> int:
	index_list = self.frequency_dict[value]
	left_pos = bisect.bisect(index_list, left - 1)
	right_pos = bisect.bisect(index_list, right)
	return right_pos - left_pos

算法分析：

时间复杂度为O(logn)，空间复杂度O(n)。