Solving the 2D Bin Packing Problem with Reinforcement Learning

To begin with, I decided to try REINFORCE, the easiest algorithm to start from.
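REINFORCE weights the log-probability of each action by the discounted return that followed it, so the first building block is computing those returns for one episode. The helper below is only a minimal sketch: the function name `discounted_returns` and the default `gamma=0.99` are assumptions for illustration, not part of the original code.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode.

    `rewards` is the list of per-step rewards collected in one episode.
    The name and the default gamma are hypothetical choices for this sketch.
    """
    returns = []
    g = 0.0
    # Walk backwards so each return can reuse the one after it.
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


# Example: three steps with reward 1 each and gamma = 0.5.
print(discounted_returns([1, 1, 1], gamma=0.5))
```

In a full training loop these returns would multiply the negative log-probabilities of the sampled actions to form the loss; that part depends on the policy network and is left out here.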

When implementing the environment, the definition of the step function is the key part.

Defining the step function first requires defining the following concepts:

action

state, statePrime (the next state)

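def action(self, action):
    # Return only the actions that are possible in the current state.
    # TODO: the validity check is not defined yet; the action is passed through.
    return action


def step(self, action):
    terminated = ...  # TODO: the termination condition still needs to be defined

    if not terminated:
        self.state = ...  # TODO: define statePrime (the next state)
        reward = ...  # TODO: reward for a non-terminal step is undecided
    else:
        reward = ...  # TODO: reward at termination is undecided

    return self.state, reward, terminated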
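One possible way to fill in the skeleton above is sketched here. Everything in it is an assumption made for illustration and not the author's `BinPacking2D`: the class name `GridPackingEnv`, a single bin, a state made of an occupancy grid plus the index of the next item, an action given as the `(x, y)` top-left cell of that item, a reward equal to the placed item's area, and termination when the next item no longer fits anywhere (or no items remain).

```python
class GridPackingEnv:
    """Hypothetical single-bin environment; all design choices are assumptions."""

    def __init__(self, bin_width, bin_height, items):
        self.bin_width = bin_width
        self.bin_height = bin_height
        self.items = list(items)  # (width, height) pairs, placed in order
        self.reset()

    def reset(self):
        # grid[y][x] is True when the cell is already occupied.
        self.grid = [[False] * self.bin_width for _ in range(self.bin_height)]
        self.next_item = 0
        return self.grid, self.next_item

    def _fits(self, x, y, w, h):
        # The item must stay inside the bin and overlap no occupied cell.
        if x + w > self.bin_width or y + h > self.bin_height:
            return False
        return not any(self.grid[j][i]
                       for j in range(y, y + h)
                       for i in range(x, x + w))

    def valid_actions(self):
        """All (x, y) positions where the next item fits in the current state."""
        if self.next_item >= len(self.items):
            return []
        w, h = self.items[self.next_item]
        return [(x, y)
                for y in range(self.bin_height)
                for x in range(self.bin_width)
                if self._fits(x, y, w, h)]

    def step(self, action):
        x, y = action
        w, h = self.items[self.next_item]
        if not self._fits(x, y, w, h):
            raise ValueError(f"invalid action {action}")
        # statePrime: mark the covered cells and move on to the next item.
        for j in range(y, y + h):
            for i in range(x, x + w):
                self.grid[j][i] = True
        self.next_item += 1
        reward = w * h  # assumed reward: area of the item just placed
        terminated = not self.valid_actions()  # assumed termination rule
        return (self.grid, self.next_item), reward, terminated


env = GridPackingEnv(4, 4, [(2, 2), (4, 2), (3, 3)])
state, reward, terminated = env.step((0, 0))
print(reward, terminated, env.valid_actions())
```

Restricting the policy to `valid_actions()` keeps the agent from ever proposing an overlapping placement. Whether area is a good reward signal, and whether an episode should instead open a new bin when an item does not fit, are open design questions that this sketch does not settle.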

Full source

# Example usage
bin_width = 10
bin_height = 10
items = [(3, 3), (5, 2), (4, 4), (1, 9), (2, 3), (2, 3), (3, 3), (3, 3), (3, 3), (10, 1)]  # List of items with their width and height

bin_packing = BinPacking2D(bin_width, bin_height, items)
bin_packing.solve()

Reference:

[1] The Bin Packing Problem by Google, https://developers.google.com/optimization/pack/bin_packing, Creative Commons Attribution 4.0 License